Polypeptides having thyrotropin-receptor activity, nucleic acid sequences coding for such receptors and polypeptides, and applications of these polypeptides

ABSTRACT

The invention concerns a polypeptide possessing thyrotropin receptor activity, characterized in that it comprises the amino acid sequence shown in FIG. 11 (i.e., SEQ ID NO: 59), and cDNA molecules encoding this polypeptide. The invention also relates to vectors comprising cDNAs which encode the polypeptide of FIG. 11, cells transformed with these vectors, and the use of the transformed cells expressing the thyrotropin receptor in the detection of TSH or antibodies against the thyrotropin receptor.

The invention relates to polypeptides having thyrotropin-receptor activity, to nucleic acids coding for such polypeptides, to antibodies to these polypeptides and to the use of the polypeptides and antibodies in assay methods.

The literature references indicated by numbers in parentheses in this specification are listed in the form of a bibliography at the end of the description.

Pituitary glycoproteins (luteinizing hormone, LH; follicle stimulating hormone, FSH; and thyroid stimulating hormone or thyrotropin, TSH) form a family of closely related hormones.

The pituitary hormone thyrotropin (TSH) is the main physiological agent regulating the thyroid gland. It stimulates the function and the proliferation of thyrocytes and induces the expression of differentiation (1). Most of its effects are mediated by cyclic AMP (cAMP) (1). Like the other pituitary and placental glycoprotein hormones (FSH, LH, CG), TSH is a heterodimer. All these hormones share an identical alpha subunit; the beta subunit, despite sequence similarity, is specific for each (2). The activated TSH, FSH and LH-CG receptors stimulate adenylyl cyclase in their target cells via mechanisms mediated by the G protein Gs (3). In man, the TSH receptor may be the target of autoimmune reactions leading to hyper- or hypo-stimulation of the thyroid gland by autoantibodies in Graves' disease and in idiopathic myxoedema, respectively (4).

A prerequisite to studies of such diseases and to the elucidation of receptor structure and function is the availability of receptor preparations, particularly human, at a reasonable cost and in relative abundance.

To date, particulate membrane preparations and detergent-solubilised thyroid membranes, often of porcine or bovine origin (4), have been used in such studies. Human receptor preparations are not only costly but are also difficult to reproduce identically. Furthermore, the known preparations cannot be considered to be “purified” receptors; they are enriched with respect to their receptor content but do not allow purification of the receptor to a degree which would enable a partial sequence analysis, and hence its cloning. These receptor preparations have never allowed characterisation of the entity responsible for the TSH-binding activity.

Cloning and expression of the related LH-CG receptor has recently been achieved. A cDNA for the rat LH-CG receptor was isolated with use of a DNA probe generated in a polymerase chain reaction with oligonucleotide primers based on peptide sequences of purified receptor protein (15). Variants of the porcine LH-CG receptor were cloned by screening a λgt11 library with cDNA probes isolated with monoclonal antibodies (16).

Attempts have been made to clone the TSH receptor (6) using a method which exploits the sequence similarity displayed by all known G-protein coupled receptors. A pair of oligonucleotide primers corresponding to transmembrane segments III and VI was used on cDNA from thyroid tissue under conditions allowing amplification of the primed sequences by the polymerase chain reaction. The method did not allow cloning of the TSH receptor but led instead to the cloning of four new members of the G-protein coupled receptor family.

The difficulties encountered in purifying and in cloning the TSH receptor are thought to be due to its extraordinarily low abundance, even in thyroid cells.

The present inventors have succeeded in cloning the TSH receptor and variants thereof, firstly by applying the technique described in (6) but with different sets of primers, and with human genomic DNA as the template rather than cDNA, and secondly by use of a selected sequence amplified by this technique as a probe.

BRIEF DESCRIPTION OF THE DRAWINGS

Certain aspects of the invention are illustrated in FIGS. 1 to 12. Figures illustrating amino-acid sequences use the one-letter abbreviation system.

FIG. 1 is a sequence comparison of clone HGMP09 with a panel of G-protein coupled receptors (6 and ref. therein). Only the sequence around transmembrane segment III of each receptor is shown in the one-letter code. Residues conserved in HGMP09 and in more than 50% of the other receptors are indicated by an asterisk. The “DRY” and “Asp113” residues (9) are indicated by ^. Sequences HGMP09 through RDC1 are listed as SEQ ID NO:35 through SEQ ID NO:53 in the attached SEQUENCE LISTING.

FIGS. 2a-2d show the primary structure of the dog TSH receptor (SEQ ID NO:75), as deduced from the nucleic acid sequence of dTSHr. The sequence was aligned (17) with full-length rat and pig LH-CG sequences (SEQ ID NO:55 and SEQ ID NO:56, respectively) (15, 16) and with the HGMP09 partial sequence. Numbering is given from the first residue predicted in the mature polypeptide by the von Heijne algorithm (11). Identical residues and conservative replacements in TSHr and LH-CGr are indicated by * and ., respectively. Sites for N glycosylation are underlined. Putative transmembrane segments are overlined. Lambda phages containing dTSHr inserts were subcloned in M13 and sequenced on both strands (Applied Biosystems model 370A) using a combination of forced cloning and exonuclease III deletions (21).

FIG. 2e is a dendrogram constructed from the sequences of G-protein coupled receptors. The CLUSTAL algorithm (17) was used to construct a dendrogram from the sequences of 22 receptors ((6) and references therein) including rat and pig LH-CG receptors (16, 17), HGMP09 and the TSH receptor. For each receptor, a segment corresponding to the known sequence of HGMP09 (131 residues, extending from transmembrane segments II to V) was used for comparison by the program.

FIG. 3a shows TSH induced morphological changes in Y1 cells microinjected with TSH receptor mRNA. Y1 cells were microinjected with recombinant TSH receptor mRNA (0.1 pl at 0.25 ug/ul) (right) or water (left) as described (13) and incubated in control medium (upper panel) or with TSH (0.1 nM) (lower panel). RO 201724 and isobutylmethylxanthine (10⁻⁶ M each) were present in all incubations.

FIG. 3b shows TSH induced cAMP accumulation in Xenopus oocytes microinjected with TSH receptor mRNA. Xenopus oocytes were handled as described (22) and injected with water (open symbols) or recombinant TSH receptor mRNA (13) (50 nl at 0.1 ug/ul) (filled symbols). After 3 days in control medium, batches of 35 oocytes were incubated for 90 min. in medium supplemented with various concentrations of TSH (circles), LH (squares) or FSH (triangles). cAMP was determined as described (14). RO 201724 and isobutylmethylxanthine (10⁻⁶ M each) were present in all incubations. Incubation of control oocytes in forskolin at 10⁻⁴ M resulted in doubling of the cAMP concentration (not shown).

FIG. 4 illustrates the displacement of ¹²⁵I TSH from receptors expressed in cos7 cells. Cos7 cells were transfected with TSH receptor cDNA subcloned in pSVL (23). After 72 hours, cells were harvested and a membrane fraction was prepared (24). Membranes were similarly prepared from wild type cos7 cells and from dog thyrocytes in primary culture (20). Binding of ¹²⁵I TSH (TRAK Henning) was performed at 0° C. for 120 min. in the presence of various concentrations of competitors (TSH-Armour, FSH and LH, UCB bioproducts). Bound radioactivity was separated by centrifugation and counted. Results are expressed as percent ¹²⁵I TSH bound by transfected cells in the absence of competitor (3,000 cpm) over non-specific binding (radioactivity bound in the presence of 100 nM cold TSH, 800 cpm). Open and filled circles represent displacement by cold TSH from cos7 and thyrocyte membranes, respectively. Open and filled squares represent displacement from cos7 by LH and FSH, respectively. Diamonds represent control cos7 cells in the presence of various amounts of cold TSH.

FIGS. 5a-5c show the cDNA sequence coding for the dog TSH receptor (SEQ ID NO:57), which was expressed in oocytes and culture cells.

FIG. 6 is a schematic representation of the dog thyrotropin receptor, showing the 7 putative transmembrane segments and the large NH2 terminal extracellular domain (to the exclusion of the signal peptide). The amino-acids deleted in the variant form are indicated in black. The five putative glycosylation sites are shown.

FIG. 7 shows the sequence alignment of the repeats constituting the extracellular domain of the thyrotropin receptor amino-acid sequence (SEQ ID NO:58). The signal peptide, as determined by the von Heijne algorithm, is represented in italics. The repeat missing in the molecular variant of the receptor is indicated by the leftward arrow.

FIGS. 8a and 8b show the primary structure of the human TSH receptor as deduced from its cDNA sequence (SEQ ID NO:59). The amino-acid sequence corresponds to the 2292 nucleotide open reading frame determined from the sequencing of two overlapping inserts in lambda gt11 clones (see examples). It is aligned for comparison with the dog TSH receptor sequence (only non-conserved amino-acids are indicated). Numbering starts from the first residue of the mature polypeptide as determined by the von Heijne algorithm (11). Potential N-glycosylation sites are underlined and putative transmembrane segments are overlined.

FIG. 9 shows the displacement by nonradioactive TSH of [¹²⁵I]TSH from human TSH receptors expressed in cos-7 cells. Results are expressed as percentage of the [¹²⁵I]-labelled TSH bound by transfected cells in the absence of competitor (1400 cpm) after correcting for nonspecific binding (radioactivity bound in the presence of 100 nM unlabelled TSH, 240 cpm).
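The normalisation described in this legend can be illustrated by a short calculation. The following is a minimal sketch only, assuming that "correcting for nonspecific binding" means subtracting the nonspecific counts from both the measured and the no-competitor counts; the function name and the 820 cpm reading are hypothetical and do not come from the figure.

```python
def percent_specific_binding(bound_cpm, total_cpm, nonspecific_cpm):
    """Express a reading as a percentage of the specific binding seen
    without competitor (assumed interpretation of the FIG. 9 legend)."""
    specific_max = total_cpm - nonspecific_cpm
    if specific_max <= 0:
        raise ValueError("total binding must exceed nonspecific binding")
    return 100.0 * (bound_cpm - nonspecific_cpm) / specific_max


# Reference counts quoted in the legend to FIG. 9.
TOTAL_CPM = 1400        # bound in the absence of competitor
NONSPECIFIC_CPM = 240   # bound in the presence of 100 nM unlabelled TSH

# Hypothetical reading at some intermediate competitor concentration.
example = percent_specific_binding(820, TOTAL_CPM, NONSPECIFIC_CPM)
print(f"{example:.1f}% of specific binding remains")
```

With this convention a reading equal to the no-competitor counts maps to 100% and a reading equal to the nonspecific counts maps to 0%.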

FIG. 10 represents the displacement by immunoglobulins of [¹²⁵I]TSH from human TSH receptor expressed in cos-7 cells. Results are expressed as described in the legend to FIG. 9. Immunoglobulins were prepared (see examples) from a normal individual (N), from patients with idiopathic myxoedema (IM1, IM2) or Graves' disease (GD1, GD2). The concentration of immunoglobulins in the assay is indicated. The ability of IM1 and IM2 (1.5 mg/ml) to inhibit TSH-stimulated cAMP production in a human thyrocyte assay was 100% and 85%, respectively. The thyroid stimulating activity of GD1 and GD2 (1.5 mg/ml) was equivalent to that of 10 mU/ml of TSH, respectively.

FIGS. 11a and 11b show the primary structure of a TSH receptor according to the invention, in which a plurality of letters at any one site indicates the presence of one of the given amino acid residues at that site. SEQ ID NO:29 lists the complete sequence. SEQ ID NO:81 and SEQ ID NO:82 list the first and second full sequences, respectively, with all possible substitutions included. SEQ ID NO:81 lists the complete sequence shown and SEQ ID NO:82 lists the sequence with substitution shown in the figure.

FIGS. 12a-12f illustrate the cDNA sequence of the cloned human TSH receptor (SEQ ID NO:62).

The invention relates to polypeptides possessing thyrotropin receptor activity, characterised in that they comprise the amino-acid sequence shown in FIG. 11, or a fragment thereof, or an amino-acid sequence derived from this sequence by substitution or deletion of any of the amino-acid residues indicated in FIG. 11, or by insertion of additional amino-acid residues. Such derived sequences may show, for example, about 80% homology with the sequence of FIG. 11. The polypeptides of the invention are in substantially pure form, and are preferably in a non-thyroid environment. By ‘substantially pure form’ is meant ‘free of impurities’ associated with detergent-solubilised thyroid membrane preparations.
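By way of illustration only, a figure such as "about 80% homology" can be approximated as percent identity over a pairwise alignment. The sketch below assumes the two sequences have already been aligned to equal length with "-" as the gap character; the specification does not state how homology is computed, and the variant string used here is hypothetical. The reference string is the first ten residues of the x sequence given further on.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns, ignoring columns that are
    gaps in both sequences (one common convention, assumed here)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # column carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


reference = "MRPADLLQLV"   # first ten residues of x as printed
variant = "MRPPDLLHLV"     # hypothetical variant with two substitutions
print(percent_identity(reference, variant))
```

Other definitions of homology (for example, counting conservative replacements as matches) would give different values for the same pair.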

By “TSH-receptor activity” is meant either TSH-binding properties or anti-TSH receptor antibody-binding properties or ability to activate adenylyl cyclase or phospholipase C via G proteins when exposed to TSH or anti-TSHr antibodies. These properties are easily verified by contacting the polypeptide with, for example, labelled TSH or labelled anti-TSHr antibodies or by monitoring the adenylyl cyclase activity of a membrane preparation containing the polypeptide. The polypeptides of the invention include the entire TSH receptor as identified by the inventors, and fragments or variants of this polypeptide as defined below. The entire TSH receptor is composed of a signal peptide (20 residues) followed by a large putative extracellular domain (398 residues) containing 5 sites for N-glycosylation, connected to a 346 residue COOH domain containing seven putative transmembrane segments. The amino-acid sequence of the receptor is illustrated in FIG. 11.
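The domain sizes quoted above can be checked against the other lengths given in this specification. The following arithmetic sketch uses only numbers stated in the text (20, 398 and 346 residues; the 764 amino-acid open reading frame reported for the dog clone in the Examples; the 2292 nucleotide human open reading frame of FIG. 8); the variable names are illustrative, and whether the 2292 nucleotides include a stop codon is not stated.

```python
# Domain lengths as stated in the description (residues).
SIGNAL_PEPTIDE = 20
EXTRACELLULAR_DOMAIN = 398
COOH_DOMAIN = 346

total_residues = SIGNAL_PEPTIDE + EXTRACELLULAR_DOMAIN + COOH_DOMAIN

# Human open reading frame length from the legend to FIG. 8 (nucleotides).
HUMAN_ORF_NT = 2292
orf_codons, remainder = divmod(HUMAN_ORF_NT, 3)

print(total_residues)          # sum of the three domains
print(orf_codons, remainder)   # codons in the ORF and leftover nucleotides
```

The three domains sum to the 764 residues reported for the dog open reading frame, and 2292 nucleotides correspond to a whole number of codons.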

More particularly, the invention relates to a polypeptide characterised in that it comprises an amino-acid sequence represented by the following general formula:

[x]_(n)−[y]_(m)−[z]_(p)

wherein n=0 or 1; m=0 or 1; p=0 or 1;

with the proviso that n+m+p>0

and x, y and z are defined as follows, using the one-letter amino-acid symbol, wherein a plurality of letters at any one site indicates the presence of one of the given amino-acid residues at that site (sequences are listed in full with replacement residues in the SEQUENCE LISTING; i.e., x=SEQ ID NO:1 or alternately SEQ ID NO:2, with alternate residues shown below the sequence):

x = MRPADLLQLVLLLDLPRDL,        PP  H A   A   S

y=at least the minimum number of consecutive amino-acids of the following sequence (SEQ ID NO:3 or alternately SEQ ID NO:4, which contains alternate residues shown below the sequence) necessary for the presentation of immunogenic properties:

     GGMGCSSPPCECHQEEDFRVTCKDIQRIPSLPPSTQTLKLI       K  P         D         H   T         FETHLRTIPSHAFSNLPNISRIYVSIDLTLQQLESHSFYNLSKVTHIEIRNTRNLTYIDPD  Q K    R            L   A   R           M         S  SALKELPLLKFLGIFNTGLKMFPDLTKVYSTDIFFILEITDNPYMTSIPVNAFQGLCNETL                  GV   V       V            A   ATLKLYNNGFTSVQGYAFNGTKLDAVYLNKNKYLTVIDKDAFGGVYSGPSLLDVSQTSVTA           I  H                  SA             T     YLPSKGLEHLKELIARNTWTLKKLPLSLSFLHLTRADLSYPSHCCAFKNQKKIRGILESLMCNESSMQSLRQRKSVNALNSPLHQEYEENLGDSIVGYKEKSKFQDTHNNAHYYVFFEEQE     IR         T  G FD     Y    HA   DN Q    DS SDEIIGFGQELKNPQEETLQAFDSHYDYTICGDSEDMVCTPKSDEFNPCED   L                        V  GN

and z=[I-II-II_(i)-III-III_(i)-IV-V-VI-VII-VII_(i)]

wherein the amino-acid sequences I-II-II_(i)-III-III_(i)-IV-V-VI-VII-VII_(i) are independently present or absent and have the following meanings:

I = IMGYKFLRIVVWFVSLLALLGNVFVLLILLTSHYK                               IV

(SEQ ID NO:5 or SEQ ID NO:6) or at least 12 consecutive amino-acid residues of this sequence;

II = LNVPRFLMCNLAFADFCMGMYLLLIASVDLYTHSEYYNHA      T               I I         IH K Q H Y

(SEQ ID NO:7 or SEQ ID NO:8, respectively) or at least 12 consecutive amino-acid residues of this sequence;

II_(i) = IDWQTGPGC       A

(SEQ ID NO:9 or SEQ ID NO:10, respectively) or at least 2 consecutive amino-acid residues of this sequence;

III = NTAGFFTVFASELSVYTLTVITL       DA

(SEQ ID NO:11 or SEQ ID NO:12, respectively) or at least 22 consecutive amino-acid residues of this sequence;

III_(i) = ERWYAITFAMRLD    HT  H  Q

(SEQ ID NO:13 or SEQ ID NO:14, respectively) or at least 2 consecutive amino-acid residues of this sequence;

IV = RKIRLRHACAIMVGGWVCCFLLALLPLVGISSYAKVSICL     C VQ    YSV   M IFA AA  F IF     M

(SEQ ID NO:15, SEQ ID NO:16 or SEQ ID NO:17, respectively) or at least 12 consecutive amino-acid residues of this sequence;

V = PMDTETPLALAYIVFVLTLNIVAFVIVCCCYVKIYITVRN       IDS  SQL VIL  L  VL  I   S                  MSL V

(SEQ ID NO:18, SEQ ID NO:19, or SEQ ID NO:20, respectively) or at least 12 consecutive amino-acid residues of this sequence;

VI = PQYNPGDKDTKIAKRMAVLIFTDFICMAPISFYALSAILNKPLIT                             M            LM

(SEQ ID NO:21 or SEQ ID NO:22, respectively) or at least 12 consecutive amino-acid residues of this sequence;

VII = VSNSKILLVLFYPLNSCANPFLYAIFTKAFQRD        T

(SEQ ID NO:23 or SEQ ID NO:24, respectively) or at least 12 consecutive amino-acid residues of this sequence;

VII_(i) = VFILLSKFGICKRQAQAYRGQRVPPKNSTDIQVQKVTHDMRQGLHNMEDVYELIENS                       S    AG  I    R    S P  Q E   LHLTPKKQGQISEEYMQTVL     N      K  N

(SEQ ID NO:25 or SEQ ID NO:26, respectively) or at least 12 consecutive amino-acid residues of this sequence;

it being understood that any of the above-specified amino-acids can be replaced or deleted, and that extra amino-acid residues may be inserted provided the thyrotropin receptor activity is maintained.
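The combinations permitted by the general formula, with n, m and p each equal to 0 or 1 and the proviso n+m+p>0, can be enumerated directly. The following is an illustrative sketch only; the function names are hypothetical, and the enumeration says nothing about which combinations retain receptor activity.

```python
from itertools import product


def allowed_combinations():
    """All (n, m, p) with each index 0 or 1 and n + m + p > 0."""
    return [
        (n, m, p)
        for n, m, p in product((0, 1), repeat=3)
        if n + m + p > 0
    ]


def describe(n, m, p):
    """Render a combination as the parts of [x]-[y]-[z] that are present."""
    parts = [name for name, flag in (("x", n), ("y", m), ("z", p)) if flag]
    return "-".join(f"[{name}]" for name in parts)


for combo in allowed_combinations():
    print(combo, describe(*combo))
```

For example, (0, 1, 0) corresponds to a polypeptide consisting of the [y] region alone, and (1, 1, 1) to the complete [x]-[y]-[z] arrangement.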

The sequence represented by [x]_(n) in the above general formula corresponds to the signal sequence of the TSH receptor. This part of the polypeptide naturally ensures the transport to the cell membrane of the adjoining [y] and/or [z] fragments, after its production in the cell. Clearly the signal sequence does not need to be present in the polypeptide in cases where transport to the membrane is not required (for example in in vitro translation of the mRNA encoding the polypeptide), or may be replaced by other signal sequences to facilitate production of the recombinant receptor in certain host cells.

The sequence represented by [z]_(p) in the above general formula corresponds to the COOH domain of the entire polypeptide containing the seven putative transmembrane fragments I-VII, which show homology with the corresponding region of other G-protein coupled receptors. The polypeptides of the invention include, as indicated above, variants of the basic TSH receptor sequence lacking part or all of the transmembrane domain. It is thought that these types of variants may exist naturally as a result of an alternative splicing phenomenon. By homology with other, known G-protein coupled receptors, the seven putative transmembrane segments have tentatively been identified as shown in FIG. 11 (numbered I to VII). The variant polypeptides of the invention include polypeptides missing some or all of the fragments I-VII_(i) as defined above, which definition includes the putative extracellular and intracellular “loops” occurring between the transmembrane segments (see FIG. 6). The transmembrane segment(s) missing may therefore, for example, be a segment selected from segments I to VII as indicated in FIG. 11 or may be part of one of those segments, or may be a transmembrane segment in conjunction with its adjoining intracellular and/or extracellular loop.

It is also within the terms of the invention to replace some or all of the transmembrane domain by the transmembrane domain, or part of this domain, of a different receptor, thus giving rise to a hybrid receptor. This type of receptor is particularly interesting in cases where the extracellular part of the TSH receptor needs to be anchored in a cell membrane having characteristics which are different from, or even incompatible with, the transmembrane portion of the TSH receptor. It is also possible to use as the transmembrane domain in a hybrid receptor any amino-acid sequence exhibiting suitable anchoring properties. Such a sequence could be entirely synthetic or based on any transmembrane protein.

It is to be noted that the invention also embraces polypeptides having thyrotropin receptor activity which lack the entire transmembrane domain. In this case, the polypeptide corresponds to the extracellular domain of the naturally occurring receptor. This extracellular part of the receptor, which is apparently responsible for ligand binding, is identified by the region [y] in the general formula. A polypeptide lacking the entire transmembrane domain is represented by the general formula [y]_(m), where m=1, the [z] part of the sequence being absent. This extracellular part of the receptor, [y], is characterised by an imperfect repeat structure which can be aligned as shown in FIG. 7. The polypeptides of the invention include variants in which one or more of these repeats is missing. It is however important that sufficient amino-acids be present to allow formation of antibodies (monoclonal or polyclonal). Such immunogenic amino-acid sequences may comprise for example 5, 6, 7, 8 or 9 consecutive amino-acids of the “y” sequence defined above. The immunogenic nature of the fragments of the invention is tested by injection of the fragment in question into a laboratory animal, followed by investigation of the reactivity between any antibodies thus formed and the immunising fragment.
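The notion of "5, 6, 7, 8 or 9 consecutive amino-acids" of a sequence can be made concrete by listing every window of a given length. The sketch below is illustrative only: the string Y_START is the opening stretch of the y sequence as printed in the general formula (alternate residues omitted), the function names are hypothetical, and enumerating a fragment says nothing about whether it is actually immunogenic, which the text states is tested in a laboratory animal.

```python
def consecutive_fragments(sequence, length):
    """Return every run of `length` consecutive residues in `sequence`."""
    if length <= 0:
        raise ValueError("length must be positive")
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]


def is_consecutive_fragment(candidate, sequence):
    """True if `candidate` occurs as an uninterrupted run in `sequence`."""
    return len(candidate) > 0 and candidate in sequence


# Opening stretch of y as printed (illustrative excerpt, primary residues only).
Y_START = "GGMGCSSPPCECHQEEDFRVTCKDIQRIPSLPPSTQTLKLI"

for size in (5, 6, 7, 8, 9):
    windows = consecutive_fragments(Y_START, size)
    print(size, len(windows), windows[0])
```

A sequence of N residues yields N - k + 1 windows of length k, so shorter minimum lengths admit more candidate fragments.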

In particular, the invention encompasses polypeptides in which the second repeat (marked by an arrow in FIG. 7) is missing.

The invention also relates to nucleic acid sequences coding for the polypeptides of the invention as well as the corresponding complementary sequences. Examples of such sequences are those shown in FIGS. 5 and 12, and fragments of these sequences, as well as corresponding degenerate sequences. The nucleic acid fragments embraced by the invention normally have at least 8 nucleotides and have preferably at least 12, or more preferably at least 16, nucleotides. Such fragments, or their complementary sequences, can be used as primers in the amplification of segments of DNA using the polymerase chain reaction, for example in the production of cDNA coding for the polypeptides having thyrotropin receptor activity.
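As an illustration of the complementary sequences and fragment lengths mentioned above, the following sketch computes a reverse complement and checks a fragment against the stated minimum lengths of 8, 12 and 16 nucleotides. The example fragment is hypothetical and is not taken from FIGS. 5 or 12; only the four standard DNA bases are handled, so degenerate or inosine-containing primers are outside its scope.

```python
# Translation table for the four standard DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence):
    """Return the complementary strand read 5' to 3'."""
    return sequence.translate(_COMPLEMENT)[::-1]


def meets_minimum_length(fragment, minimum=8):
    """Check a fragment against a minimum length in nucleotides."""
    return len(fragment) >= minimum


fragment = "ATGGCCTTAGCA"  # hypothetical 12-nucleotide fragment
print(reverse_complement(fragment))
for minimum in (8, 12, 16):
    print(minimum, meets_minimum_length(fragment, minimum))
```

Applying the reverse complement twice returns the original fragment, which is a convenient sanity check when designing primer pairs.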

The nucleic acid sequences of the invention coding for the entire TSH receptor are in a genetic environment other than that found naturally in thyroid cells. For example, the genetic environment may be that of a Cos-7 cell, a CHO cell or a Y1 cell.

The polypeptides of the invention can be produced in several different ways. For example, a host cell such as COS 7 cells, CHO cells, NIH3T3 cells, Xenopus oocytes or Y1 cells can be transformed by a vector containing a nucleic acid insert coding for the desired peptide, in conjunction with all the necessary regulatory elements such as promoter, transcription termination signals etc., or can be microinjected with recombinant mRNA transcribed from appropriate vectors containing the receptor-encoding sequence. Expression of the insert normally leads to the insertion of the recombinant polypeptide in the membrane of the cell used as host. In this way, the receptor polypeptide can be used in this form, the receptor thus being present in a non-thyroidal eukaryotic cellular environment, or in a solubilised membrane fragment form. The non-thyroid cells expressing the recombinant receptor exhibit a receptor density of up to ten times that observed in thyroid cells.

Furthermore, in the case where only a fragment of the polypeptide is required, the correspondingly shorter nucleic acid sequence can be used to transform a suitable host cell, for example, a DNA coding for the putative extracellular region only, or one or more repeats of the repetitive portion of this region. It is also within the terms of the invention to synthesise the polypeptide chemically, by successive assembly of the required amino-acid residues. In cases where larger fragments are desired, it is possible to synthesise first a series of smaller fragments and to ultimately assemble these fragments to form the larger fragment.

The invention also relates to antibodies, both polyclonal and monoclonal, to the thyrotropin-receptor polypeptides. The antibodies of the invention are preferably in a purified form, and may be of animal origin, e.g. rabbit or mouse. As mentioned earlier, in man the TSH-receptor may be the target of auto-immune reactions giving rise to hyper- or hypo-stimulation of the thyroid gland by stimulating or blocking autoantibodies respectively. The antigenic nature of the polypeptides of the invention, particularly the putative extracellular domain, permits the preparation of antibodies, which can be used in a variety of studies and assays. The TSH-receptor binds both TSH and anti-TSHr antibodies, thus it is possible in certain studies to replace TSH by anti-TSHr antibodies. The phenomenon of competition between labelled and unlabelled species is particularly useful in such assays. Use of specific fragments of the TSH receptor allows the preparation of antibodies against defined epitopes, and, by using a panel of such antibodies, allows further characterisation of the type of disorder present in auto-immune patients.

One such assay falling within the terms of the invention is a process for the quantitative detection of thyrotropin (TSH) or of anti-thyrotropin receptor antibodies (anti-TSHr) in a biological sample, characterised in that a polypeptide according to the invention is contacted with the biological sample suspected of containing TSH or anti-TSHr antibodies and, either simultaneously or subsequently, contacted with labelled TSH or with labelled anti-TSHr antibodies, and the remaining bound labelled TSH or bound labelled anti-TSHr antibodies, after competition between the labelled and unlabelled species, are measured.

In this type of assay, the competition between the labelled TSH or labelled antibodies and the unlabelled TSH or antibodies present in the biological sample is measured as an indication of the concentration of that species in the sample.

Alternatively, instead of using competition between two like-species to measure TSH, it is also possible to use a receptor polypeptide to bind the TSH in the biological sample, and then after washing to add labelled anti-TSH antibodies which selectively detect the bound TSH. This type of assay can also be carried out using immobilized or solubilised receptor polypeptide to bind the anti-TSHr-antibody in a biological sample, followed by detection of the bound antibody by labelled anti-immunoglobulins or protein A or protein G or any other agent capable of recognizing an antibody.

Another means of assaying the TSH or anti-TSHr antibodies in a sample exploits the effect which the binding of these species with the TSH receptor has on the adenylyl cyclase activity of the cell bearing the receptor. Thus, this aspect of the invention relates to a process for the quantitative detection of TSH or of anti-TSHr antibodies, characterised by contacting intact cells operationally transformed by a nucleotide sequence encoding a polypeptide of the invention, or membrane preparations of such cells, with the biological sample suspected of containing TSH or anti-TSHr antibodies and measuring in the intact cells or membranes the change in adenylyl cyclase activity, for example by measuring cAMP generation or release.

The binding of TSH itself or of stimulating anti-TSHr antibodies to the receptor polypeptide leads to an increase in adenylyl cyclase activity, whereas the binding of blocking anti-TSHr antibodies leads to an inhibition of TSH-induced adenylyl cyclase stimulation. By comparing the adenylyl cyclase activity induced by exposure of the receptor polypeptide to TSH with that induced by antibodies in a sample, it is possible, according to the invention, to distinguish blocking antibodies from stimulating antibodies. In order to quantitatively determine blocking antibodies in a sample, the sample is contacted with the receptor polypeptides either at the same time as with TSH, or before contacting with TSH. In this way the adenylyl cyclase stimulating effect of TSH on the receptor is blocked by the blocking antibodies and is quantified to indicate the concentration of blocking antibodies present in the sample. Such measurements can be carried out in intact cells bearing the TSH receptors of the invention, or in membrane preparations of such cells. Other effector systems which can be used in this type of detection are, for example, activities of phospholipase C, protein tyrosine kinase, phospholipase A2 etc.
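The comparison described above, between the response to TSH alone and the response in the presence of a sample, can be sketched as a simple decision rule. Everything specific in the following is an assumption made for illustration: the four-measurement design, the function name, the 0.2 tolerance and the arbitrary cAMP units are hypothetical and are not values taught by the specification.

```python
def classify_sample(basal, tsh_only, sample_only, sample_plus_tsh, tolerance=0.2):
    """Classify a sample from four cAMP readings (arbitrary units).

    stimulation: response to the sample alone, as a fraction of the TSH response.
    inhibition:  fraction of the TSH response lost when the sample is present.
    The tolerance threshold is a hypothetical cut-off, not a validated value.
    """
    tsh_response = tsh_only - basal
    if tsh_response <= 0:
        raise ValueError("TSH control must raise cAMP above basal")
    stimulation = (sample_only - basal) / tsh_response
    inhibition = 1.0 - (sample_plus_tsh - basal) / tsh_response
    if stimulation > tolerance:
        return "stimulating"
    if inhibition > tolerance:
        return "blocking"
    return "no detectable activity"


# Hypothetical readings: basal, TSH alone, sample alone, sample with TSH.
print(classify_sample(1.0, 11.0, 9.0, 12.0))
print(classify_sample(1.0, 11.0, 1.0, 2.0))
print(classify_sample(1.0, 11.0, 1.5, 10.5))
```

In practice the cut-off would have to be set from control sera, and a sample containing both stimulating and blocking antibodies would not be resolved by a rule this simple.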

The labels used in the assays of the invention are those conventionally used in the art, for example, radioactive labelling, enzymatic labelling, labelled anti-immunoglobulins, protein A, protein G, depending upon the type of assay.

Another aspect of the invention relates to a process for the quantitative detection of fragments of TSH receptor in a biological fluid. Such fragments may be found circulating in patients suffering from thyroid disorders. This aspect of the invention involves contacting the sample with antibodies according to the invention which have previously been labelled, if necessary, and determining the binding, if any, in the sample by any method involving separation of bound labelled antibody from unbound labelled antibody or by competition between the said fragments and a polypeptide according to the invention. In this latter case it is necessary to label the receptor polypeptide, for example with ¹²⁵I.

The antibodies of the invention may also be used in the immunohistochemical detection of TSH receptors, for example in endocrinological investigations or in anatomopathology. In this type of process, the antibodies are again labelled to permit their detection.

The polypeptides of the invention may also be used in the purification of stimulating or blocking antibodies to TSHr and of TSH by contacting the polypeptide with a source of TSH or anti-TSHr antibodies, separating the bound and free fractions and finally dissociating the receptor-bound entity. If necessary, further successive purification steps known per se may be added. Such a purification process is facilitated by the immobilisation of the receptor polypeptide, for example in a column or any other solid support.

The invention also embraces kits suitable for the detection of TSH or anti-TSHr antibodies. Such kits are characterised in that they contain:

a) a polypeptide according to the invention and defined above, said polypeptide having thyrotropin receptor activity and being either in an immobilised or solubilised form;

b) at least one of the following reagents:

i) labelled TSH

ii) labelled anti-TSHr antibodies

iii) reagents necessary for the measurement of adenylyl cyclase activity.

For example, a kit for effecting the detection of autoantibodies directed against the TSH receptor by competition would include the polypeptide of the invention, in immobilised or solubilised form, with labelled TSH or unlabelled TSH in combination with agents permitting the TSH to be labelled. Alternatively, such a kit might include antibodies to the TSH receptor and means of labelling them, instead of the TSH.

The invention will be illustrated by the following examples:

EXAMPLES

I—Cloning of Dog TSHr

a) Identification of HGMP09

As most G protein-coupled receptor genes do not contain introns in their coding sequence, a strategy similar to that previously described (6) was used, but using different sets of degenerate primers and with human genomic DNA as starting material. Eleven clones displaying sequence similarity with G-protein coupled receptors were thus obtained (7). Out of these, one clone (HGMP09), which was amplified with primers corresponding to transmembrane segments II and VII, presented sequence characteristics suggesting that it belonged to a distinct subfamily of receptors.

The primers used in this amplification were:

5′ TAGATCTAGACCTGGCGITTGCCGATCT 3′               T  T C GC  T  CA                      G and 5′ ACTTAAGCTTGCAGTAGCCCAIAGGATT 3′                    A  AAAG  G  G

a plurality of nucleotides at any one site indicating the presence of one of the given nucleotides at that site. Sequences are listed sequentially as SEQ ID NO:51 through SEQ ID NO:55 with alternative nucleotides inserted.

A dendrogram constructed from the alignment shown in FIG. 1 demonstrated that HGMP09 is equally distant from all receptors cloned to date (7); in particular, it does not contain the canonical Asp Arg Tyr (DRY) tripeptide close to transmembrane segment III (8) and lacks the Asp residue implicated in the binding of charged amines in adrenergic (Asp113), muscarinic, dopaminergic and serotonergic receptors (9).

b) Identification of dog TSHr

In the frame of a systematic screening for the expression of the new receptors in thyroid tissue, HGMP09 was used as a probe both in Northern blotting experiments with thyroid and non-thyroid tissues, and in screening of a dog thyroid cDNA library. HGMP09 did not hybridize to thyroid mRNA but identified a major 2.6 kb transcript in the ovary and the testis. However, under moderate conditions of stringency it hybridized to one out of 50,000 thyroid cDNA clones, suggesting cross-hybridization with a relatively abundant putative receptor of the thyroid. From these characteristics, it was hypothesized that HGMP09 encoded a receptor fragment, distinct from the TSH receptor, but with sequence characteristics expected from close relatives like LH or FSH receptors. A full-length cross-hybridizing clone (dTSHr) was isolated and used as a probe in Northern blots of ten different dog tissues. It hybridized to a 4.9 kb transcript present only in the thyroid gland and in cultured thyrocytes. Interestingly, the signal was much stronger in cultured thyrocytes exposed for several days to the cAMP agonist forskolin than in thyroid tissue. This is a characteristic one would expect from the TSH receptor, whose expression is known to be up-regulated by cAMP agonists in cultured cells (10). A 4,417 bp cDNA clone was sequenced completely. It contains an open reading frame of 764 amino-acids beginning with a 20 residue signal peptide, as predicted by the von Heijne algorithm (11) (FIG. 2a). Comparison to known G-protein coupled receptors (see hereunder and FIG. 2b) and hydropathy profile analysis (not shown) demonstrated a 346 residue C-terminal structure with seven putative transmembrane domains preceded by 398 amino-acids constituting a large putative extracellular domain.

c) Expression of dog TSHr

The encoded polypeptide was unambiguously identified as the TSH receptor by expression of the cDNA in a variety of systems. Microinjection of recombinant mRNA in adrenocortical Y1 cells and in Xenopus oocytes conferred a TSH responsive phenotype to both systems. Y1 cells responded to TSH by a characteristic morphological change which is triggered by elevation of cAMP in the cytoplasm (12, 13). Xenopus oocytes (FIG. 3) displayed a dose-dependent increase in cAMP which was specific for stimulation by TSH and corresponded to the expected sensitivity of the dog receptor to bovine TSH (half-maximal effect around 0.3 nM) (14). Transient expression of the receptor cDNA was obtained in Cos7 cells (FIG. 4). Specific binding of [¹²⁵I]TSH to membranes was observed only in transfected cells. The displacement curve of the label by TSH presented characteristics very similar to that obtained with membranes from dog thyrocytes (half-maximal displacement at 0.4 nM and 0.16 nM for Cos7 cells and thyrocytes, respectively) (FIG. 4c). The slight rightward shift of the displacement curve obtained with Cos7 cell membranes may reflect the higher amount of receptors in this system.
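The meaning of "half-maximal displacement" and of a rightward shift can be illustrated numerically. The sketch below assumes a simple one-site competition model, in which the fraction of label still bound is 1 / (1 + [TSH] / IC50); this model is an assumption made for illustration and is not stated in the specification. Only the two half-maximal values (0.4 nM and 0.16 nM) are taken from the text.

```python
def fraction_bound(competitor_nM, ic50_nM):
    """Fraction of labelled TSH still bound, for an assumed one-site model."""
    return 1.0 / (1.0 + competitor_nM / ic50_nM)


# Half-maximal displacement values quoted in the text.
IC50_COS7 = 0.4        # nM, membranes from transfected Cos7 cells
IC50_THYROCYTE = 0.16  # nM, membranes from dog thyrocytes

for dose in (0.04, 0.16, 0.4, 1.6):
    cos7 = fraction_bound(dose, IC50_COS7)
    thyro = fraction_bound(dose, IC50_THYROCYTE)
    print(f"{dose:5.2f} nM  Cos7: {cos7:.2f}  thyrocytes: {thyro:.2f}")
```

At any given dose of unlabelled TSH the Cos7 curve retains more label than the thyrocyte curve, which is what a rightward shift of the displacement curve means.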

The cDNA coding for the dog TSH receptor was sequenced completely. The sequence is given in FIG. 5.

d) Comparison of TSHr with LH-CGr

Comparison of the TSH receptor with the LH-CG receptor cloned recently (15, 16) reveals interesting common characteristics which make them members of a new subfamily of G-protein coupled receptors. They both display a long aminoterminal extension containing multiple sites for N-glycosylation (five in the TSH receptor). The TSH receptor has an extra 52 residue insert close to the junction between the putative extracellular domain and the first transmembrane segment (FIG. 2a). The overall sequence similarity between the extracellular domains of the TSH and LH-CG receptors is 45% (FIG. 2a). The similarity between a segment of soybean lectin and the rat LH receptor (15) is not conserved in the TSH receptor, which suggests that it may be fortuitous. The C-terminal half of the TSH receptor containing the transmembrane segments is 70% similar to both the pig and rat LH receptors (FIG. 2a). The homology is particularly impressive in the transmembrane segments themselves, where stretches of up to 24 identical residues are observed in a row (transmembrane region III). Also, the carboxyl terminal region of the third putative intracellular loop, which is particularly short in TSH and LH receptors and which has been implicated in the interaction with G_(αs) (8, 9), is identical in both receptor types. This pattern of similarities gives support to the view that the extracellular domain would be involved in the recognition of the ligands (4), while the membrane-inserted domain would be responsible for the activation of G_(αs) (15, 16). Together, the TSH and LH-CG receptors, and HGMP09 (there is strong preliminary evidence that HGMP09 could actually be the FSH receptor (7)) clearly constitute a distinct subfamily of G-protein coupled receptors. A sequence similarity dendrogram (17) including most of the G-protein coupled receptors cloned to date demonstrates both their close relationships and their distance from the bulk of the other receptors (FIG. 2b). The complete sequence of the FSH receptor will reveal whether the known ability of LH-CG to stimulate the TSH receptor (18) is reflected by a higher sequence similarity of the extracellular domains of LH and TSH receptors.
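Percentages such as those quoted above are computed over aligned residues. The sketch below is illustrative only: it counts strict identity, whereas the text speaks of similarity, and the scoring scheme behind the quoted figures is not given. The input is the pair of nine-residue peptides that appear as sequences 5 and 6 in the sequence listing at the end of this specification, written here in one-letter code; the function name is an assumption.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of identical positions between two pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Sequences 5 and 6 of the sequence listing, in one-letter code.
PEPTIDE_5 = "IDWQTGPGC"  # Ile Asp Trp Gln Thr Gly Pro Gly Cys
PEPTIDE_6 = "IDWQTGAGC"  # Ile Asp Trp Gln Thr Gly Ala Gly Cys

print(f"{percent_identity(PEPTIDE_5, PEPTIDE_6):.1f}% identical")
```

The two peptides differ only at position 7 (Pro versus Ala), so eight of nine positions match.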

e) Identification of a dog TSHr variant

Screening of the dog thyroid cDNA library (30) with the HGMP09 clone (thought to be part of the FSH receptor) gave rise to 16 positive clones out of the 840,000 screened plaques. Hybridization was carried out at 42° C. in 35% formamide and the filters were washed at 65° C. in 2×SSC, 0.1% SDS before autoradiography. 12 clones were purified to homogeneity and analyzed by EcoRI digestion. Three clones (dTSHR1, dTSHR2 and dTSHR3) were subcloned in M13mp18 and pBS vectors. dTSHR1 and dTSHR2 consisted of two EcoRI fragments of 2800 and 1500 bp, respectively. dTSHR3 was shorter, and consisted of 2200 and 1500 bp EcoRI fragments. Restriction analysis of the 2800 bp fragments of dTSHR1 and dTSHR2 revealed slight differences in the restriction map, the main discordance being the presence of a PstI restriction site in dTSHR1 and its absence in dTSHR2. dTSHR1 was sequenced completely and revealed an open reading frame of 764 codons which was identified as the thyrotropin receptor by expression of the cDNA in oocytes and cell cultures (see example I(b) and FIG. 5). dTSHR3 was shown by sequencing to be completely colinear with dTSHR1 but this clone lacked 600 bp at its 5′ end. Because of the difference in the restriction map of dTSHR1 and dTSHR2, this latter clone was also sequenced on both strands.
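The screening figures can be turned into a clone frequency and compared with the approximate "one out of 50,000" quoted earlier for the same library. A minimal sketch, using only the two counts given above; the helper name is an assumption.

```python
from fractions import Fraction


def clone_frequency(positives, plaques_screened):
    """Exact frequency of positive clones among the screened plaques."""
    return Fraction(positives, plaques_screened)


# Counts taken from the text: 16 positives out of 840,000 plaques.
dog_frequency = clone_frequency(16, 840_000)
plaques_per_positive = 1 / dog_frequency

print(f"one positive per {int(plaques_per_positive):,} plaques")
```

The result, one positive per 52,500 plaques, agrees with the rounded figure of one in 50,000 thyroid cDNA clones.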

The sequence revealed a number of mutations when compared with the dTSHR1 clone. A total of 5 mutations, including two single base substitutions, one single base deletion, one single base insertion and one 5 base insertion, were found scattered in the 2060 bp long 3′ untranslated region (not shown). However, the main difference between dTSHR2 and dTSHR1 was located in the coding region, and consisted of a 75 bp deletion located 240 bp after the start codon. The corresponding 25 amino acid deletion in the protein itself is located in the long NH2 terminal extracellular domain which is characteristic of the TSH receptor (25) and its recently cloned close relative, the LH receptor (15, 16) (FIG. 6). As in the LH receptor, the NH2 terminal part of the thyrotropin receptor is characterized by an imperfect repeat structure that can be aligned as indicated in FIG. 7. These repeats are similar in structure to the leucine-rich repeats found in the various proteins comprising the family of leucine-rich glycoproteins (26, 15, and references therein). The deletion in the dTSHR2 clone corresponds exactly to one of these repeats, in a region of the protein where the repeat length is regular and their alignment unambiguous. The existence of such a variant considerably reinforces the significance of this repeated structure and raises interesting questions concerning its functional meaning and the structure of the chromosomal gene.
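A 75 bp deletion removes a whole number of codons, which is why the reading frame downstream is preserved and exactly 25 residues are lost. The sketch below makes this explicit. It rests on a loudly stated assumption: "240 bp after the start codon" is read as meaning that the first deleted nucleotide is coding position 241, counting the A of the ATG as position 1. The specification does not define the numbering, so the codon positions printed are illustrative only.

```python
def deleted_codons(first_deleted_nt, deletion_len):
    """Return (first, last) 1-based codon numbers removed by a deletion.

    Handles only the simple case of a codon-aligned, in-frame deletion.
    """
    if deletion_len % 3 != 0:
        raise ValueError("deletion shifts the reading frame")
    if (first_deleted_nt - 1) % 3 != 0:
        raise ValueError("deletion does not start on a codon boundary")
    first = (first_deleted_nt - 1) // 3 + 1
    last = first + deletion_len // 3 - 1
    return first, last


FULL_LENGTH = 764   # codons in the dTSHR1 open reading frame (from the text)
DELETION_BP = 75    # size of the dTSHR2 deletion (from the text)
FIRST_DELETED = 241  # ASSUMPTION: see the lead-in paragraph

first, last = deleted_codons(FIRST_DELETED, DELETION_BP)
variant_length = FULL_LENGTH - DELETION_BP // 3
print(f"codons {first}-{last} removed; variant has {variant_length} residues")
```

Whatever the exact start position, the variant is 25 residues shorter than the 764 residue receptor.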

The extracellular domains of TSH and LH receptors are apparently responsible for the ligand binding (4). The deleted repeat also contains one of the 5 consensus sequences for N-glycosylation. Glycosylation of the TSH receptor could be important for ligand binding or signal transduction. Whether, and to what extent, the lack of this repeat influences the binding capabilities and/or the function of the receptor variant is not yet known. Comparison of cell lines expressing this variant with the presently available stable transfectants expressing the full size receptor should partially answer this question. The functional analysis of other in-vitro generated mutants of the TSH receptor will complete the study.

The deletion of a full repeat also gives some insight into the structure of the TSH receptor gene. It is highly probable that the repeat unit corresponds to a complete exon, and it is therefore possible that all repeats would be separated by introns. It is interesting to note that most of the genes coding for G-protein coupled receptors are completely devoid of intronic structures. The functional or evolutionary significance of this observation is not known, but a highly fragmented exonic structure of the glycoprotein hormone receptor genes would be in clear contrast to the rest of the receptor family.

II—Cloning of the Human TSHr

A human lambda gt11 cDNA library (29) was screened with a fragment of the dog TSHr (25). Out of the 218 clones scored as positive (1/6000), 24 were plaque-purified to homogeneity and the size of the inserts was determined. Two clones which harbored inserts of 2370 bp and 3050 bp, respectively, were subcloned as overlapping fragments in M13 derivatives and sequenced (FIG. 12). A total of 4272 bp were determined in which a 2292 bp open reading frame was identified. When translated into protein, the coding sequence showed an overall 90.3% similarity with the dog TSHr (FIG. 8) [1]. It displayed all the characteristics described recently for the subfamily of G protein-coupled receptors binding glycoprotein hormones (25, 15, 16): a signal peptide (20 residues) followed by a large putative extracellular domain (398 residues) containing 5 sites for N-glycosylation, connected to a 346 residue carboxyl-terminal domain containing seven putative transmembrane segments (FIG. 8). It has been suggested that the amino-terminal domain, which is not found in other G protein-coupled receptors, might correspond to the region involved in the binding of the bulky pituitary and placental glycoprotein hormones (25, 15, 16).
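Two of the figures above lend themselves to a quick consistency check. The sketch below converts the length of the open reading frame into codons and estimates how many plaques were screened. Both steps involve assumptions that are not stated in the specification: that the 2292 bp figure does not count the stop codon, and that "1/6000" means one positive per 6000 plaques screened. The helper name is illustrative.

```python
def codons_in_orf(orf_bp):
    """Number of codons in an open reading frame of the given length."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return orf_bp // 3


# ASSUMPTION: the 2292 bp open reading frame excludes the stop codon.
human_codons = codons_in_orf(2292)

# ASSUMPTION: "1/6000" is the ratio of positives to plaques screened.
estimated_plaques = 218 * 6000

print("codons in human ORF:", human_codons)
print(f"estimated plaques screened: {estimated_plaques:,}")
```

Under the first assumption the human open reading frame has the same 764 codons as the dog clone, matching the sum of the three domain lengths quoted in the paragraph (20 + 398 + 346).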

Variants of the hTSHr

When probed with the putative human TSHr, a Northern blot of RNA from human placenta, testis and thyroid revealed two major 4.6 and 4.4 kb thyroid-specific transcripts. Minor thyroid-specific RNA species of smaller size were also observed. Although the former could simply correspond to multiple polyadenylation sites in the 3′ region of the gene, this situation is reminiscent of what has been observed for the porcine LH-CG receptor. In this case, multiple transcripts were found to correspond to variants of the receptor cDNA lacking the potential to encode the membrane spanning segments (16). Whether this observation, with important implications for receptor function and regulation, also applies to the human TSHr must await sequencing of additional clones from the cDNA library.

Expression of hTSHr

To provide definite proof that the clones isolated encoded a human TSH receptor, the coding sequence was inserted in the SV40-based vector pSVL, and the resulting construct transfected into Cos-7 cells (24). Membranes prepared from transfected cells demonstrated specific binding of [¹²⁵I]TSH (FIG. 9). The unlabelled competitor TSH was bovine. The characteristics of the displacement curve with unlabelled TSH were similar to those observed with the dog TSHr assayed under similar conditions (half-maximal displacement around 0.5 nM) (25).

From the sequence similarity with dog TSHr, the tissue specific expression of the corresponding transcripts and the binding studies on membranes from transfected Cos-7 cells, it was concluded that a bona fide human TSHr has been cloned.

Antibodies to hTSHr

To investigate the relevance of the cloned human TSHr to thyroid autoimmunity, competition was tested between [¹²⁵I]TSH and immunoglobulins prepared from patients, for binding to the recombinant receptor expressed in Cos-7 cells (FIG. 10). Immunoglobulins were prepared from the serum of patients or normal individuals by ammonium sulphate precipitation. They were dissolved in water and dialysed extensively against Dulbecco's modified Eagle medium. While immunoglobulins from normal individuals did not displace [¹²⁵I]TSH, samples from two patients with idiopathic myxoedema clearly did, in a dose-dependent manner. The steep dose-response which was observed suggests that immunoglobulins from these patients present a very high affinity for the recombinant receptor. When samples from two patients with Graves' disease having high levels of thyroid stimulating immunoglobulins in the circulation were tested, one of them showed limited ability to displace labelled TSH under the conditions of the assay (FIG. 10). The difference in the potency of these two categories of immunoglobulins to displace TSH from the receptor expressed in Cos-7 cells may reflect differences in their affinity for a common antigen. Alternatively, despite previous studies suggesting that both stimulating and blocking antibodies would bind to the same part of the TSHr (26, 27), it may correspond to more basic differences in the actual nature of their respective antigenic targets. Studies in which the binding activity of a larger collection of immunoglobulins is compared to their ability to activate adenylate cyclase in permanently transfected cells will help to clarify this point.

BIBLIOGRAPHY

1. J. E. Dumont, G. Vassart & S. Refetoff, in The Metabolic Bases of Inherited Diseases, C. R. Scrivers, A. L. Beaudet, W. S. Sly & D. Vale eds. McGraw-Hill, pp. 1843-1880 (1989).

2. I. A. Kourides, J. A. Gurr & O. Wolw, Rec. Progr. Horm. Res. 40, 79-120 (1984).

3. F. Ribeiro-Neto, L. Birnbaumer and J M. B. Field, Mol. Endo. 1, pp. 482-490 (1987).

4. B. Rees-Smith, S. M. McLachlan & J. Furmaniak, Endocrine Rev. 9, 106-121 (1988).

5. R. K. Saiki et al., Science 239, pp. 487-491 (1988).

6. F. Libert et al., Science 244, pp. 569-572 (1989).

7. M. Parmentier et al., to be published elsewhere.

8. B. F. O'Dowd, et al. J. Biol. Chem. 263, 15985-15992 (1988).

9. C. Strader, I. S. Sigal & R. Dixon. FASEB J. 3, 1825-1832 (1989).

10. S. Lissitzky, G. Fayet and B. Verrier. Adv. Cyclic Nucl. Res. 5, 133-152 (1975).

11. G. von Heijne, Nucl. Acids Res. 14, 4683 (1986).

12. B. P. Schimmer, in Functionally Differentiated Cell Lines, pp. 61-92. G. Sato, ed. Alan R. Liss Inc. (1981) N.Y.

13. C. Maenhaut & F. Libert, submitted. Y1 cells were grown as monolayers as described (12). 1 mm² areas were marked on the bottom of the dishes and all cells in these areas were microinjected with mRNA at 0.25 µg/µl in water. mRNA was synthesized from TSH receptor cDNA subcloned in pSP64 (Promega). After 30 min, TSH was added and the cells were photographed 120 min later. The morphological changes (stable for 20 hours) were observed with TSH concentrations down to 0.1 nM. FSH, LH and hCG were ineffective (not shown).

14. J. Van Sande, P. Cochaux and J. E. Dumont. FEBS Lett. 150, 137-141 (1982).

15. K. C. McFarland et al., Science 245, 494 (1989).

16. H. Loosfelt et al., ibid. 245, 525 (1989).

17. D. G. Higgins and P. M. Sharp, Gene, 73, 237-244 (1988).

18. J. G. Kenimer, J. M. Hershman & H. P. Higgins. J. Clin. Endocr. Metab. 40, 482 (1975).

19. T. Maniatis, E. F. Fritsch and J. Sambrook, (1982) In Molecular Cloning: A Laboratory Manual (Cold Spring Harbor Laboratory Press, New York).

20. P. Roger et al. Eur. J. Biochem. 152, 239-245 (1985).

21. F. Sanger, S. Nicklen & A. R. Coulson. Proc. Natl. Acad. Sci. U.S.A. 74, 5463 (1977). S. Henikoff. Gene 28, 351 (1984).

22. B. K. Kobilka et al. J. Biol. Chem. 262, 7321 (1987).

23. G. Wong, Y. S. et al. Science 228, 810-815 (1985).

24. R. A. F. Dixon et al. Nature 326, 73-77 (1987).

25. Parmentier, M., Libert, F., Maenhaut, C., Lefort, A., Gerard, C., Perret, J., Van Sande, J., Dumont, J. E. and Vassart, G. (1989). Submitted.

26. Takahashi, N., Takahashi, Y. and Putman, F. W. (1985). Proc. Natl. Acad. Sci. U.S.A. 82, 1906.

27. Davies Jones, E., Hashim, F., Creagh, F., Williams, S. and Rees Smith, B. (1985) Mol. Cell. Endo. 41, 257-265.

28. Amino, N., Watanabe, Y., Tamaki, H., Iwatani, Y. and Miyai, K. (1987) Clin. Endo. 27, 615-620.

29. Libert, F., Ruel, J., Ludgate, M., Swillens, S., Alexander, N., Vassart, G. and Dinsart, C. (1987) EMBO J. 6, 4193-4196.

30. Lefort, A., Lecocq, R., Libert, F., Lamy, F., Swillens, S., Vassart, G. and Dumont, J. E. (1989) EMBO J. 8, 111-116.

62 19 amino acids amino acid single linear peptide unknown Modified-site/label= Xaa /note= “A or P” Modified-site /label= Xaa /note= “D or P”Modified-site /label= Xaa /note= “Q or H” Modified-site 10 /label= Xaa/note= “V or A” Modified-site 14 /label= Xaa /note= “D or A”Modified-site 18 /label= Xaa /note= “D or S” 1 Met Arg Pro Xaa Xaa LeuLeu Xaa Leu Xaa Leu Leu Leu Xaa Leu Pro 1 5 10 15 Arg Xaa Leu 391 aminoacids amino acid single linear peptide unknown Modified-site /label= Xaa/note= “K or M” Modified-site /label= Xaa /note= “S or P” Modified-site16 /label= Xaa /note= “E or D” Modified-site 26 /label= Xaa /note= “Q orH” Modified-site 30 /label= Xaa /note= “S or T” Modified-site 40 /label=Xaa /note= “L or F” Modified-site 44 /label= Xaa /note= “H or Q”Modified-site 46 /label= Xaa /note= “R or K” Modified-site 51 /label=Xaa /note= “H or R” Modified-site 64 /label= Xaa /note= “L or V”Modified-site 68 /label= Xaa /note= “L or A” Modified-site 72 /label=Xaa /note= “Q or R” Modified-site 84 /label= Xaa /note= “V or M”Modified-site 94 /label= Xaa /note= “N or S” Modified-site 97 /label=Xaa /note= “Y or S” Modified-site 120 /label= Xaa /note= “K or G”Modified-site 121 /label= Xaa /note= “M or V” Modified-site 125 /label=Xaa /note= “L or V” Modified-site 133 /label= Xaa /note= “I or V”Modified-site 146 /label= Xaa /note= “T or A” Modified-site 150 /label=Xaa /note= “V or A” Modified-site 173 /label= Xaa /note= “V or I”Modified-site 176 /label= Xaa /note= “Y or H” Modified-site 195 /label=Xaa /note= “T or S” Modified-site 196 /label= Xaa /note= “V or A”Modified-site 210 /label= Xaa /note= “S or T” Modified-site 216 /label=Xaa /note= “Q or Y” Modified-site 287 /label= Xaa /note= “M or I”Modified-site 288 /label= Xaa /note= “Q or R” Modified-site 298 /label=Xaa /note= “A or T” Modified-site 301 /label= Xaa /note= “S or G”Modified-site 303 /label= Xaa /note= “L or F” Modified-site 304 /label=Xaa /note= “H or D” Modified-site 310 /label= Xaa /note= “N or Y”Modified-site 315 
/label= Xaa /note= “I or H” Modified-site 316 /label=Xaa /note= “V or A” Modified-site 320 /label= Xaa /note= “E or D”Modified-site 321 /label= Xaa /note= “K or N” Modified-site 323 /label=Xaa /note= “K or Q” Modified-site 328 /label= Xaa /note= “H or D”Modified-site 329 /label= Xaa /note= “N or S” Modified-site 331 /label=Xaa /note= “A or S” Modified-site 345 /label= Xaa /note= “I or L”Modified-site 370 /label= Xaa /note= “I or V” Modified-site 373 /label=Xaa /note= “D or G” Modified-site 374 /label= Xaa /note= “S or N” 2 GlyGly Xaa Gly Cys Xaa Ser Pro Pro Cys Glu Cys His Gln Glu Xaa 1 5 10 15Asp Phe Arg Val Thr Cys Lys Asp Ile Xaa Arg Ile Pro Xaa Leu Pro 20 25 30Pro Ser Thr Gln Thr Leu Lys Xaa Ile Glu Thr Xaa Leu Xaa Thr Ile 35 40 45Pro Ser Xaa Ala Phe Ser Asn Leu Pro Asn Ile Ser Arg Ile Tyr Xaa 50 55 60Ser Ile Asp Xaa Thr Leu Gln Xaa Leu Glu Ser His Ser Phe Tyr Asn 65 70 7580 Leu Ser Lys Xaa Thr His Ile Glu Ile Arg Asn Thr Arg Xaa Leu Thr 85 9095 Xaa Ile Asp Pro Asp Ala Leu Lys Glu Leu Pro Leu Leu Lys Phe Leu 100105 110 Gly Ile Phe Asn Thr Gly Leu Xaa Xaa Phe Pro Asp Xaa Thr Lys Val115 120 125 Tyr Ser Thr Asp Xaa Phe Phe Ile Leu Glu Ile Thr Asp Asn ProTyr 130 135 140 Met Xaa Ser Ile Pro Xaa Asn Ala Phe Gln Gly Leu Cys AsnGlu Thr 145 150 155 160 Leu Thr Leu Lys Leu Tyr Asn Asn Gly Phe Thr SerXaa Gln Gly Xaa 165 170 175 Ala Phe Asn Gly Thr Lys Leu Asp Ala Val TyrLeu Asn Lys Asn Lys 180 185 190 Tyr Leu Xaa Xaa Ile Asp Lys Asp Ala PheGly Gly Val Tyr Ser Gly 195 200 205 Pro Xaa Leu Leu Asp Val Ser Xaa ThrSer Val Thr Ala Leu Pro Ser 210 215 220 Lys Gly Leu Glu His Leu Lys GluLeu Ile Ala Arg Asn Thr Trp Thr 225 230 235 240 Leu Lys Lys Leu Pro LeuSer Leu Ser Phe Leu His Leu Thr Arg Ala 245 250 255 Asp Leu Ser Tyr ProSer His Cys Cys Ala Phe Lys Asn Gln Lys Lys 260 265 270 Ile Arg Gly IleLeu Glu Ser Leu Met Cys Asn Glu Ser Ser Xaa Xaa 275 280 285 Ser Leu ArgGln Arg Lys Ser Val Asn Xaa Leu Asn Xaa Pro Xaa Xaa 290 295 300 Gln GluTyr Glu Glu Xaa Leu Gly Asp Ser Xaa Xaa Gly 
Tyr Lys Xaa 305 310 315 320Xaa Ser Xaa Phe Gln Asp Thr Xaa Xaa Asn Xaa His Tyr Tyr Val Phe 325 330335 Phe Glu Glu Gln Glu Asp Glu Ile Xaa Gly Phe Gly Gln Glu Leu Lys 340345 350 Asn Pro Gln Glu Glu Thr Leu Gln Ala Phe Asp Ser His Tyr Asp Tyr355 360 365 Thr Xaa Cys Gly Xaa Xaa Glu Asp Met Val Cys Thr Pro Lys SerAsp 370 375 380 Glu Phe Asn Pro Cys Glu Asp 385 390 35 amino acids aminoacid single linear peptide unknown Modified-site 27 /label= Xaa /note=“L or I” Modified-site 28 /label= Xaa /note= “I or V” 3 Ile Met Gly TyrLys Phe Leu Arg Ile Val Val Trp Phe Val Ser Leu 1 5 10 15 Leu Ala LeuLeu Gly Asn Val Phe Val Leu Xaa Xaa Leu Leu Thr Ser 20 25 30 His Tyr Lys35 40 amino acids amino acid single linear peptide unknown Modified-site/label= Xaa /note= “N or T” Modified-site 18 /label= Xaa /note= “M or I”Modified-site 20 /label= Xaa /note= “M or I” Modified-site 30 /label=Xaa /note= “L or I” Modified-site 31 /label= Xaa /note= “Y or H”Modified-site 33 /label= Xaa /note= “H or K” Modified-site 35 /label=Xaa /note= “E or Q” Modified-site 37 /label= Xaa /note= “Y or H”Modified-site 39 /label= Xaa /note= “Y or H” 4 Leu Xaa Val Pro Arg PheLeu Met Cys Asn Leu Ala Phe Ala Asp Phe 1 5 10 15 Cys Xaa Gly Xaa TyrLeu Leu Leu Ile Ala Ser Val Asp Xaa Xaa Thr 20 25 30 Xaa Ser Xaa Tyr XaaAsn Xaa Ala 35 40 9 amino acids amino acid single linear peptide unknown5 Ile Asp Trp Gln Thr Gly Pro Gly Cys 1 5 9 amino acids amino acidsingle linear peptide unknown 6 Ile Asp Trp Gln Thr Gly Ala Gly Cys 1 523 amino acids amino acid single linear peptide unknown Modified-site/label= Xaa /note= “N or D” Modified-site /label= Xaa /note= “T or A” 7Xaa Xaa Ala Gly Phe Phe Thr Val Phe Ala Ser Glu Leu Ser Val Tyr 1 5 1015 Thr Leu Thr Val Ile Thr Leu 20 13 amino acids amino acid singlelinear peptide unknown Modified-site /label= Xaa /note= “Y or H”Modified-site /label= Xaa /note= “A or T” Modified-site /label= Xaa/note= “F or H” Modified-site 11 /label= Xaa /note= “R or Q” 8 Glu ArgTrp Xaa Xaa Ile Thr Xaa Ala 
Met Xaa Leu Asp 1 5 10 40 amino acids aminoacid single linear peptide unknown Modified-site /label= Xaa /note= “Ror C” Modified-site /label= Xaa /note= “I or V” Modified-site /label=Xaa /note= “R or Q” Modified-site /label= Xaa /note= “C or Y or A”Modified-site 10 /label= Xaa /note= “A or S” Modified-site 11 /label=Xaa /note= “I or V” Modified-site 15 /label= Xaa /note= “G or M”Modified-site 17 /label= Xaa /note= “V or I” Modified-site 18 /label=Xaa /note= “C or F” Modified-site 19 /label= Xaa /note= “C or A”Modified-site 21 /label= Xaa /note= “L or A” Modified-site 22 /label=Xaa /note= “L or A” Modified-site 25 /label= Xaa /note= “L or F”Modified-site 27 /label= Xaa /note= “L or I” Modified-site 28 /label=Xaa /note= “V or F” Modified-site 34 /label= Xaa /note= “A or M” 9 XaaLys Xaa Xaa Leu Arg His Ala Xaa Xaa Xaa Met Val Gly Xaa Trp 1 5 10 15Xaa Xaa Xaa Phe Xaa Xaa Ala Leu Xaa Pro Xaa Xaa Gly Ile Ser Ser 20 25 30Tyr Xaa Lys Val Ser Ile Cys Leu 35 40 40 amino acids amino acid singlelinear peptide unknown Modified-site /label= Xaa /note= “T or I”Modified-site /label= Xaa /note= “E or D” Modified-site /label= Xaa/note= “T or S” Modified-site /label= Xaa /note= “A or S” Modified-site10 /label= Xaa /note= “L or Q” Modified-site 11 /label= Xaa /note= “A orL” Modified-site 13 /label= Xaa /note= “I or V” Modified-site 14 /label=Xaa /note= “V or I or M” Modified-site 15 /label= Xaa /note= “F or L orS” Modified-site 16 /label= Xaa /note= “V or L” Modified-site 18 /label=Xaa /note= “T or L or V” Modified-site 21 /label= Xaa /note= “I or V”Modified-site 22 /label= Xaa /note= “V or L” Modified-site 25 /label=Xaa /note= “V or I” Modified-site 29 /label= Xaa /note= “C or S” 10 ProMet Asp Xaa Xaa Xaa Pro Leu Xaa Xaa Xaa Tyr Xaa Xaa Xaa Xaa 1 5 10 15Leu Xaa Leu Asn Xaa Xaa Ala Phe Xaa Ile Val Cys Xaa Cys Tyr Val 20 25 30Lys Ile Tyr Ile Thr Val Arg Asn 35 40 45 amino acids amino acid singlelinear peptide unknown Modified-site 25 /label= Xaa /note= “I or M”Modified-site 38 /label= 
Xaa /note= “I or L” Modified-site 39 /label=Xaa /note= “L or M” 11 Pro Gln Tyr Asn Pro Gly Asp Lys Asp Thr Lys IleAla Lys Arg Met 1 5 10 15 Ala Val Leu Ile Phe Thr Asp Phe Xaa Cys MetAla Pro Ile Ser Phe 20 25 30 Tyr Ala Leu Ser Ala Xaa Xaa Asn Lys Pro LeuIle Thr 35 40 45 33 amino acids amino acid single linear peptide unknown12 Val Ser Asn Ser Lys Ile Leu Leu Val Leu Phe Tyr Pro Leu Asn Ser 1 510 15 Cys Ala Asn Pro Phe Leu Tyr Ala Ile Phe Thr Lys Ala Phe Gln Arg 2025 30 Asp 33 amino acids amino acid single linear peptide unknown 13 ValThr Asn Ser Lys Ile Leu Leu Val Leu Phe Tyr Pro Leu Asn Ser 1 5 10 15Cys Ala Asn Pro Phe Leu Tyr Ala Ile Phe Thr Lys Ala Phe Gln Arg 20 25 30Asp 76 amino acids amino acid single linear peptide unknownModified-site 24 /label= Xaa /note= “P or S” Modified-site 29 /label=Xaa /note= “T or A” Modified-site 30 /label= Xaa /note= “D or G”Modified-site 33 /label= Xaa /note= “V or I” Modified-site 38 /label=Xaa /note= “H or R” Modified-site 43 /label= Xaa /note= “G or S”Modified-site 45 /label= Xaa /note= “H or P” Modified-site 48 /label=Xaa /note= “E or Q” Modified-site 50 /label= Xaa /note= “V or E”Modified-site 54 /label= Xaa /note= “I or L” Modified-site 62 /label=Xaa /note= “K or N” Modified-site 69 /label= Xaa /note= “E or K”Modified-site 72 /label= Xaa /note= “M or N” 14 Val Phe Ile Leu Leu SerLys Phe Gly Ile Cys Lys Arg Gln Ala Gln 1 5 10 15 Ala Tyr Arg Gly GlnArg Val Xaa Pro Lys Asn Ser Xaa Xaa Ile Gln 20 25 30 Xaa Gln Lys Val ThrXaa Asp Met Arg Gln Xaa Leu Xaa Asn Met Xaa 35 40 45 Asp Xaa Tyr Glu LeuXaa Glu Asn Ser His Leu Thr Pro Xaa Lys Gln 50 55 60 Gly Gln Ile Ser XaaGlu Tyr Xaa Gln Thr Val Leu 65 70 75 37 amino acids amino acid singlelinear peptide unknown Modified-site /label= Xaa /note= “M or K”Modified-site /label= Xaa /note= “S or P” Modified-site 15 /label= Xaa/note= “E or D” Modified-site 25 /label= Xaa /note= “Q or H”Modified-site 29 /label= Xaa /note= “S or T” 15 Gly Xaa Met Gly Cys XaaSer Pro Pro Cys Glu Cys His Gln Glu 
Xaa 1 5 10 15 Asp Phe Arg Val ThrCys Lys Asp Ile Xaa Arg Ile Pro Xaa Leu Pro 20 25 30 Pro Ser Thr Gln Thr35 24 amino acids amino acid single linear peptide unknown Modified-site/label= Xaa /note= “L or F” Modified-site /label= Xaa /note= “H or Q”Modified-site /label= Xaa /note= “R or K” Modified-site 14 /label= Xaa/note= “H or R” 16 Leu Lys Xaa Ile Glu Thr Xaa Leu Xaa Thr Ile Pro SerXaa Ala Phe 1 5 10 15 Ser Asn Leu Pro Asn Ile Ser Arg 20 25 amino acidsamino acid single linear peptide unknown Modified-site /label= Xaa/note= “V or L” Modified-site /label= Xaa /note= “L or A” Modified-site11 /label= Xaa /note= “Q or R” Modified-site 23 /label= Xaa /note= “V orM” 17 Ile Tyr Xaa Ser Ile Asp Xaa Thr Leu Gln Xaa Leu Glu Ser His Ser 15 10 15 Phe Tyr Asn Leu Ser Lys Xaa Thr His 20 25 25 amino acids aminoacid single linear peptide unknown Modified-site /label= Xaa /note= “Nor S” Modified-site 11 /label= Xaa /note= “Y or S” 18 Ile Glu Ile ArgAsn Thr Arg Xaa Leu Thr Xaa Ile Asp Pro Asp Ala 1 5 10 15 Leu Lys GluLeu Pro Leu Leu Lys Phe 20 25 25 amino acids amino acid single linearpeptide unknown Modified-site /label= Xaa /note= “K or G” Modified-site10 /label= Xaa /note= “M or V” Modified-site 14 /label= Xaa /note= “L orV” Modified-site 22 /label= Xaa /note= “I or V” 19 Leu Gly Ile Phe AsnThr Gly Leu Xaa Xaa Phe Pro Asp Xaa Thr Lys 1 5 10 15 Val Tyr Ser ThrAsp Xaa Phe Phe Ile 20 25 25 amino acids amino acid single linearpeptide unknown Modified-site 10 /label= Xaa /note= “T or A”Modified-site 14 /label= Xaa /note= “V or A” 20 Leu Glu Ile Thr Asp AsnPro Tyr Met Xaa Ser Ile Pro Xaa Asn Ala 1 5 10 15 Phe Gln Gly Leu CysAsn Glu Thr Leu 20 25 25 amino acids amino acid single linear peptideunknown Modified-site 12 /label= Xaa /note= “V or I” Modified-site 15/label= Xaa /note= “Y or H” 21 Thr Leu Lys Leu Tyr Asn Asn Gly Phe ThrSer Xaa Gln Gly Xaa Ala 1 5 10 15 Phe Asn Gly Thr Lys Leu Asp Ala Val 2025 24 amino acids amino acid single linear peptide unknown Modified-site/label= Xaa 
/note= “T or S” Modified-site 10 /label= Xaa /note= “V or A”Modified-site 24 /label= Xaa /note= “S or T” 22 Tyr Leu Asn Lys Asn LysTyr Leu Xaa Xaa Ile Asp Lys Asp Ala Phe 1 5 10 15 Gly Gly Val Tyr SerGly Pro Xaa 20 25 amino acids amino acid single linear peptide unknown23 Leu Leu Asp Val Ser Gln Thr Ser Val Thr Ala Leu Pro Ser Lys Gly 1 510 15 Leu Glu His Leu Lys Glu Leu Ile Ala 20 25 25 amino acids aminoacid single linear peptide unknown 24 Leu Leu Asp Val Ser Tyr Thr SerVal Thr Ala Leu Pro Ser Lys Gly 1 5 10 15 Leu Glu His Leu Lys Glu LeuIle Ala 20 25 23 amino acids amino acid single linear peptide unknown 25Arg Asn Thr Trp Thr Leu Lys Lys Leu Pro Leu Ser Leu Ser Phe Leu 1 5 1015 His Leu Thr Arg Ala Asp Leu 20 25 amino acids amino acid singlelinear peptide unknown 26 Ser Tyr Pro Ser His Cys Cys Ala Phe Lys AsnGln Lys Lys Ile Arg 1 5 10 15 Gly Ile Leu Glu Ser Leu Met Cys Asn 20 2525 amino acids amino acid single linear peptide unknown Modified-site/label= Xaa /note= “M or I” Modified-site /label= Xaa /note= “Q or R”Modified-site 15 /label= Xaa /note= “A or T” Modified-site 18 /label=Xaa /note= “S or G” Modified-site 20 /label= Xaa /note= “L or F”Modified-site 21 /label= Xaa /note= “H or D” 27 Glu Ser Ser Met Xaa XaaLeu Arg Gln Arg Lys Ser Val Asn Xaa Leu 1 5 10 15 Asn Xaa Pro Xaa XaaGln Glu Tyr Glu 20 25 83 amino acids amino acid single linear peptideunknown Modified-site /label= Xaa /note= “N or Y” Modified-site /label=Xaa /note= “I or H” Modified-site /label= Xaa /note= “V or A”Modified-site 12 /label= Xaa /note= “E or D” Modified-site 13 /label=Xaa /note= “K or N” Modified-site 15 /label= Xaa /note= “K or Q”Modified-site 20 /label= Xaa /note= “H or D” Modified-site 21 /label=Xaa /note= “N or S” Modified-site 23 /label= Xaa /note= “A or S”Modified-site 37 /label= Xaa /note= “I or L” Modified-site 62 /label=Xaa /note= “I or V” Modified-site 65 /label= Xaa /note= “D or G”Modified-site 66 /label= Xaa /note= “S or N” 28 Glu Xaa Leu Gly Asp SerXaa Xaa Gly Tyr Lys Xaa 
Xaa Ser Xaa Phe 1 5 10 15 Gln Asp Thr Xaa XaaAsn Xaa His Tyr Tyr Val Phe Phe Glu Glu Gln 20 25 30 Glu Asp Glu Ile XaaGly Phe Gly Gln Glu Leu Lys Asn Pro Gln Glu 35 40 45 Glu Thr Leu Gln AlaPhe Asp Ser His Tyr Asp Tyr Thr Xaa Cys Gly 50 55 60 Xaa Xaa Glu Asp MetVal Cys Thr Pro Lys Ser Asp Glu Phe Asn Pro 65 70 75 80 Cys Glu Asp 764amino acids amino acid single linear peptide unknown Modified-site/label= Xaa /note= “A or P” Modified-site /label= Xaa /note= “D or P”Modified-site /label= Xaa /note= “Q or H” Modified-site 10 /label= Xaa/note= “V or A” Modified-site 14 /label= Xaa /note= “D or A”Modified-site 18 /label= Xaa /note= “D or S” Modified-site 22 /label=Xaa /note= “M or K” Modified-site 25 /label= Xaa /note= “S or P”Modified-site 35 /label= Xaa /note= “E or D” Modified-site 45 /label=Xaa /note= “Q or H” Modified-site 49 /label= Xaa /note= “S or T”Modified-site 59 /label= Xaa /note= “L or F” Modified-site 63 /label=Xaa /note= “H or Q” Modified-site 65 /label= Xaa /note= “R or K”Modified-site 70 /label= Xaa /note= “H or R” Modified-site 83 /label=Xaa /note= “V or L” Modified-site 87 /label= Xaa /note= “L or A”Modified-site 91 /label= Xaa /note= “Q or R” Modified-site 103 /label=Xaa /note= “V or M” Modified-site 113 /label= Xaa /note= “N or S”Modified-site 116 /label= Xaa /note= “Y or S” Modified-site 139 /label=Xaa /note= “K or G” Modified-site 140 /label= Xaa /note= “M or V”Modified-site 144 /label= Xaa /note= “L or V” Modified-site 152 /label=Xaa /note= “I or V” Modified-site 165 /label= Xaa /note= “T or A”Modified-site 169 /label= Xaa /note= “V or A” Modified-site 192 /label=Xaa /note= “V or I” Modified-site 195 /label= Xaa /note= “Y or H”Modified-site 214 /label= Xaa /note= “T or S” Modified-site 215 /label=Xaa /note= “V or A” Modified-site 229 /label= Xaa /note= “S or T”Modified-site 235 /label= Xaa /note= “Q or Y” Modified-site 306 /label=Xaa /note= “M or I” Modified-site 307 /label= Xaa /note= “Q or R”Modified-site 317 /label= Xaa /note= “A or T” 
Modified-site 320 /label=Xaa /note= “A or T” Modified-site 322 /label= Xaa /note= “S or G”Modified-site 323 /label= Xaa /note= “L or F” Modified-site 329 /label=Xaa /note= “N or Y” Modified-site 334 /label= Xaa /note= “I or H”Modified-site 335 /label= Xaa /note= “V or A” Modified-site 339 /label=Xaa /note= “E or D” Modified-site 340 /label= Xaa /note= “K or N”Modified-site 342 /label= Xaa /note= “K or Q” Modified-site 347 /label=Xaa /note= “H or D” Modified-site 348 /label= Xaa /note= “N or S”Modified-site 350 /label= Xaa /note= “A or S” Modified-site 364 /label=Xaa /note= “I or L” Modified-site 389 /label= Xaa /note= “I or V”Modified-site 392 /label= Xaa /note= “D or G” Modified-site 393 /label=Xaa /note= “S or N” Modified-site 437 /label= Xaa /note= “L or I”Modified-site 438 /label= Xaa /note= “I or V” Modified-site 447 /label=Xaa /note= “N or T” Modified-site 463 /label= Xaa /note= “M or I”Modified-site 465 /label= Xaa /note= “M or I” Modified-site 475 /label=Xaa /note= “L or I” Modified-site 476 /label= Xaa /note= “Y or H”Modified-site 478 /label= Xaa /note= “H or K” Modified-site 480 /label=Xaa /note= “E or Q” Modified-site 482 /label= Xaa /note= “Y or H”Modified-site 484 /label= Xaa /note= “H or Y” Modified-site 492 /label=Xaa /note= “P or A” Modified-site 495 /label= Xaa /note= “N or D”Modified-site 496 /label= Xaa /note= “T or A” Modified-site 521 /label=Xaa /note= “Y or A” Modified-site 522 /label= Xaa /note= “A or T”Modified-site 525 /label= Xaa /note= “F or H” Modified-site 528 /label=Xaa /note= “R or Q” Modified-site 531 /label= Xaa /note= “R or C”Modified-site 533 /label= Xaa /note= “I or V” Modified-site 534 /label=Xaa /note= “R or Q” Modified-site 539 /label= Xaa /note= “C or Y or A”Modified-site 540 /label= Xaa /note= “A or S” Modified-site 541 /label=Xaa /note= “I or V” Modified-site 545 /label= Xaa /note= “G or M”Modified-site 547 /label= Xaa /note= “V or I” Modified-site 548 /label=Xaa /note= “C or F” Modified-site 549 /label= Xaa /note= “C or 
A” Modified-site 551 /label= Xaa /note= “L or A” Modified-site 552 /label= Xaa /note= “L or A” Modified-site 555 /label= Xaa /note= “L or F” Modified-site 557 /label= Xaa /note= “L or I” Modified-site 558 /label= Xaa /note= “V or F” Modified-site 564 /label= Xaa /note= “A or M” Modified-site 574 /label= Xaa /note= “T or I” Modified-site 575 /label= Xaa /note= “E or D” Modified-site 576 /label= Xaa /note= “T or S” Modified-site 579 /label= Xaa /note= “A or S” Modified-site 580 /label= Xaa /note= “L” Modified-site 581 /label= Xaa /note= “A or Q” Modified-site 583 /label= Xaa /note= “I or V” Modified-site 584 /label= Xaa /note= “V or I or M” Modified-site 585 /label= Xaa /note= “F or L or S” Modified-site 586 /label= Xaa /note= “V or L” Modified-site 588 /label= Xaa /note= “T or L or V” Modified-site 591 /label= Xaa /note= “I or V” Modified-site 592 /label= Xaa /note= “V or L” Modified-site 595 /label= Xaa /note= “V or I” Modified-site 599 /label= Xaa /note= “C or S” Modified-site 635 /label= Xaa /note= “I or M” Modified-site 648 /label= Xaa /note= “I or L” Modified-site 649 /label= Xaa /note= “L or M” Modified-site 657 /label= Xaa /note= “S or T” Modified-site 712 /label= Xaa /note= “P or S” Modified-site 717 /label= Xaa /note= “T or A” Modified-site 718 /label= Xaa /note= “D or G” Modified-site 721 /label= Xaa /note= “V or I” Modified-site 726 /label= Xaa /note= “H or R” Modified-site 731 /label= Xaa /note= “G or S” Modified-site 733 /label= Xaa /note= “H or P” Modified-site 736 /label= Xaa /note= “E or Q” Modified-site 738 /label= Xaa /note= “V or E” Modified-site 742 /label= Xaa /note= “I or L” Modified-site 750 /label= Xaa /note= “K or N” Modified-site 757 /label= Xaa /note= “E or K” Modified-site 760 /label= Xaa /note= “M or N” 29 Met Arg Pro Xaa Xaa Leu Leu Xaa Leu Xaa Leu Leu Leu Xaa Leu Pro 1 5 10 15 Arg Xaa Leu Gly Gly Xaa Gly Cys Xaa Ser Pro Pro Cys Glu Cys His 20 25 30 Gln Glu Xaa Asp Phe Arg Val Thr Cys Lys Asp Ile Xaa Arg Ile Pro 35 40 45 Xaa Leu Pro Pro Ser Thr Gln Thr Leu Lys Xaa 
Ile Glu Thr Xaa Leu 50 55 60 Xaa Thr Ile Pro Ser Xaa Ala Phe Ser Asn Leu Pro Asn Ile Ser Arg 65 70 75 80 Ile Tyr Xaa Ser Ile Asp Xaa Thr Leu Gln Xaa Leu Glu Ser His Ser 85 90 95 Phe Tyr Asn Leu Ser Lys Xaa Thr His Ile Glu Ile Arg Asn Thr Arg 100 105 110 Xaa Leu Thr Xaa Ile Asp Pro Asp Ala Leu Lys Glu Leu Pro Leu Leu 115 120 125 Lys Phe Leu Gly Ile Phe Asn Thr Gly Leu Xaa Xaa Phe Pro Asp Xaa 130 135 140 Thr Lys Val Tyr Ser Thr Asp Xaa Phe Phe Ile Leu Glu Ile Thr Asp 145 150 155 160 Asn Pro Tyr Met Xaa Ser Ile Pro Xaa Asn Ala Phe Gln Gly Leu Cys 165 170 175 Asn Glu Thr Leu Thr Leu Lys Leu Tyr Asn Asn Gly Phe Thr Ser Xaa 180 185 190 Gln Gly Xaa Ala Phe Asn Gly Thr Lys Leu Asp Ala Val Tyr Leu Asn 195 200 205 Lys Asn Lys Tyr Leu Xaa Xaa Ile Asp Lys Asp Ala Phe Gly Gly Val 210 215 220 Tyr Ser Gly Pro Xaa Leu Leu Asp Val Ser Xaa Thr Ser Val Thr Ala 225 230 235 240 Leu Pro Ser Lys Gly Leu Glu His Leu Lys Glu Leu Ile Ala Arg Asn 245 250 255 Thr Trp Thr Leu Lys Lys Leu Pro Leu Ser Leu Ser Phe Leu His Leu 260 265 270 Thr Arg Ala Asp Leu Ser Tyr Pro Ser His Cys Cys Ala Phe Lys Asn 275 280 285 Gln Lys Lys Ile Arg Gly Ile Leu Glu Ser Leu Met Cys Asn Glu Ser 290 295 300 Ser Xaa Xaa Ser Leu Arg Gln Arg Lys Ser Val Asn Xaa Leu Asn Xaa 305 310 315 320 Pro Xaa Xaa Gln Glu Tyr Glu Glu Xaa Leu Gly Asp Ser Xaa Xaa Gly 325 330 335 Tyr Lys Xaa Xaa Ser Xaa Phe Gln Asp Thr Xaa Xaa Asn Xaa His Tyr 340 345 350 Tyr Val Phe Phe Glu Glu Gln Glu Asp Glu Ile Xaa Gly Phe Gly Gln 355 360 365 Glu Leu Lys Asn Pro Gln Glu Glu Thr Leu Gln Ala Phe Asp Ser His 370 375 380 Tyr Asp Tyr Thr Xaa Cys Gly Xaa Xaa Glu Asp Met Val Cys Thr Pro 385 390 395 400 Lys Ser Asp Glu Phe Asn Pro Cys Glu Asp Ile Met Gly Tyr Lys Phe 405 410 415 Leu Arg Ile Val Val Trp Phe Val Ser Leu Leu Ala Leu Leu Gly Asn 420 425 430 Val Phe Val Leu Xaa Xaa Leu Leu Thr Ser His Tyr Lys Leu Xaa Val 435 440 445 Pro Arg Phe Leu Met Cys Asn Leu Ala Phe Ala Asp Phe Cys Xaa Gly 450 455 460 Xaa Tyr Leu Leu Leu Ile Ala Ser Val Asp Xaa Xaa Thr Xaa Ser Xaa 465 470 475 480 Tyr 
Xaa Asn Xaa Ala Ile Asp Trp Gln Thr Gly Xaa Gly Cys Xaa Xaa 485 490 495 Ala Gly Phe Phe Thr Val Phe Ala Ser Glu Leu Ser Val Tyr Thr Leu 500 505 510 Thr Val Ile Thr Leu Glu Arg Trp Xaa Xaa Ile Thr Xaa Ala Met Xaa 515 520 525 Leu Asp Xaa Lys Xaa Xaa Leu Arg His Ala Xaa Xaa Xaa Met Val Gly 530 535 540 Xaa Trp Xaa Xaa Xaa Phe Xaa Xaa Ala Leu Xaa Pro Xaa Xaa Gly Ile 545 550 555 560 Ser Ser Tyr Xaa Lys Val Ser Ile Cys Leu Pro Met Asp Xaa Xaa Xaa 565 570 575 Pro Leu Xaa Xaa Xaa Tyr Xaa Xaa Xaa Xaa Leu Xaa Leu Asn Xaa Xaa 580 585 590 Ala Phe Xaa Ile Val Cys Xaa Cys Tyr Val Lys Ile Tyr Ile Thr Val 595 600 605 Arg Asn Pro Gln Tyr Asn Pro Gly Asp Lys Asp Thr Lys Ile Ala Lys 610 615 620 Arg Met Ala Val Leu Ile Phe Thr Asp Phe Xaa Cys Met Ala Pro Ile 625 630 635 640 Ser Phe Tyr Ala Leu Ser Ala Xaa Xaa Asn Lys Pro Leu Ile Thr Val 645 650 655 Xaa Asn Ser Lys Ile Leu Leu Val Leu Phe Tyr Pro Leu Asn Ser Cys 660 665 670 Ala Asn Pro Phe Leu Tyr Ala Ile Phe Thr Lys Ala Phe Gln Arg Asp 675 680 685 Val Phe Ile Leu Leu Ser Lys Phe Gly Ile Cys Lys Arg Gln Ala Gln 690 695 700 Ala Tyr Arg Gly Gln Arg Val Xaa Pro Lys Asn Ser Xaa Xaa Ile Gln 705 710 715 720 Xaa Gln Lys Val Thr Xaa Asp Met Arg Gln Xaa Leu Xaa Asn Met Xaa 725 730 735 Asp Xaa Tyr Glu Leu Xaa Glu Asn Ser His Leu Thr Pro Xaa Lys Gln 740 745 750 Gly Gln Ile Ser Xaa Glu Tyr Xaa Gln Thr Val Leu 755 760 28 base pairs nucleic acid single linear DNA (genomic) unknown 30 TAGATCTAGA CYTGKCSNKB GCYGAYMT 28 28 base pairs nucleic acid single linear DNA (genomic) unknown 31 TAGATCTAGA CTTGTCCNGCG CTGAGAT 28 28 base pairs nucleic acid single linear DNA (genomic) unknown 32 TAGATCTAGA CCTGGCGNTGG CCGATCT 28 29 base pairs nucleic acid single linear DNA (genomic) unknown 33 ACTTTAAGCT TGCARTARMM SANRGGRTT 29 29 base pairs nucleic acid single linear DNA (genomic) unknown 34 ACTTTAAGCT TGCAATAAAA GANGGGGTT 29 57 amino acids amino acid single linear peptide unknown 35 Cys Asp Ala Ala Gly Phe Phe Thr Val Phe Ala Ser Glu Leu Ser Val 1 5 10 15 Tyr Thr Leu Thr Ala Ile Thr Leu Glu 
Arg Trp His Thr Ile Thr His 20 25 30 Ala Met Gln Leu Asp Cys Lys Val Gln Leu Arg His Ala Ala Ser Val 35 40 45 Met Val Met Gly Trp Ile Phe Ala Phe 50 55 57 amino acids amino acid single linear peptide unknown 36 Cys Asp Leu Trp Leu Ala Leu Asp Tyr Val Val Ser Asn Ala Ser Val 1 5 10 15 Met Asn Leu Leu Ile Ile Ser Phe Asp Arg Tyr Phe Cys Val Thr Lys 20 25 30 Pro Leu Thr Tyr Pro Val Lys Arg Thr Thr Lys Met Ala Gly Met Met 35 40 45 Ile Ala Ala Ala Trp Val Leu Ser Phe 50 55 57 amino acids amino acid single linear peptide unknown 37 Cys Asp Leu Trp Leu Ala Leu Asp Tyr Val Val Ser Asn Ala Ser Val 1 5 10 15 Met Asn Leu Leu Ile Ile Ser Phe Asp Arg Tyr Phe Cys Val Thr Lys 20 25 30 Pro Leu Thr Tyr Pro Ala Arg Arg Thr Thr Lys Met Ala Gly Leu Met 35 40 45 Ile Ala Ala Ala Trp Val Leu Ser Phe 50 55 57 amino acids amino acid single linear peptide unknown 38 Cys Asp Leu Trp Leu Ala Leu Asp Tyr Val Ala Ser Asn Ala Ser Val 1 5 10 15 Met Asn Leu Leu Leu Ile Ser Phe Asp Arg Tyr Phe Ser Val Thr Arg 20 25 30 Pro Leu Ser Tyr Arg Ala Lys Arg Thr Pro Arg Arg Ala Ala Leu Met 35 40 45 Ile Gly Leu Ala Trp Leu Val Ser Phe 50 55 57 amino acids amino acid single linear peptide unknown 39 Cys Asp Leu Trp Leu Ala Ile Asp Tyr Val Ala Ser Asn Ala Ser Val 1 5 10 15 Met Asn Leu Leu Val Ile Ser Phe Asp Arg Tyr Phe Ser Ile Thr Arg 20 25 30 Pro Leu Thr Tyr Arg Ala Lys Arg Thr Thr Lys Arg Ala Gly Val Met 35 40 45 Ile Gly Leu Ala Trp Val Ile Ser Phe 50 55 57 amino acids amino acid single linear peptide unknown 40 Cys Asp Ile Trp Ala Ala Val Asp Val Leu Cys Cys Thr Ala Ser Ile 1 5 10 15 Leu Ser Leu Cys Ala Ile Ser Ile Asp Arg Tyr Ile Gly Val Arg Tyr 20 25 30 Ser Leu Gln Tyr Pro Thr Leu Val Thr Arg Arg Lys Ala Ile Leu Ala 35 40 45 Leu Leu Ser Val Trp Val Leu Ser Thr 50 55 58 amino acids amino acid single linear peptide unknown 41 Cys Asp Ile Phe Val Thr Leu Asp Val Met Met Cys Thr Ala Ser Ile 1 5 10 15 Leu Asn Leu Cys Ala Ile Ser Ile Asp Arg Tyr Thr Ala Val Ala Met 20 25 30 Pro Met Leu Tyr Asn Thr Arg Tyr Ser Ser Lys Arg Arg Val Thr Val 
35 40 45 Met Ile Ala Ile Val Trp Val Leu Ser Phe 50 55 57 amino acids amino acid single linear peptide unknown 42 Cys Glu Leu Trp Thr Ser Val Asp Val Leu Cys Val Thr Ala Ser Ile 1 5 10 15 Glu Thr Leu Cys Val Ile Ala Leu Asp Arg Tyr Leu Ala Ile Thr Ser 20 25 30 Pro Phe Arg Tyr Gln Ser Leu Leu Thr Arg Ala Arg Ala Arg Gly Leu 35 40 45 Val Cys Thr Val Trp Ala Ile Ser Ala 50 55 57 amino acids amino acid single linear peptide unknown 43 Cys Glu Phe Trp Thr Ser Ile Asp Val Leu Cys Val Thr Ala Ser Ile 1 5 10 15 Glu Thr Leu Cys Val Ile Ala Val Asp Arg Tyr Phe Ala Ile Thr Ser 20 25 30 Pro Phe Lys Tyr Gln Ser Leu Leu Thr Lys Asn Lys Ala Arg Val Ile 35 40 45 Ile Leu Met Val Trp Ile Val Ser Gly 50 55 57 amino acids amino acid single linear peptide unknown 44 Cys Glu Phe Trp Thr Ser Ile Asp Val Leu Cys Val Thr Ala Ser Ile 1 5 10 15 Glu Thr Leu Cys Val Ile Ala Val Asp Arg Tyr Ile Ala Ile Thr Ser 20 25 30 Pro Phe Lys Tyr Gln Ser Leu Leu Thr Lys Asn Lys Ala Arg Met Val 35 40 45 Ile Leu Met Val Trp Ile Val Ser Gly 50 55 57 amino acids amino acid single linear peptide unknown 45 Cys Asp Ile Trp Leu Ser Ser Asp Ile Thr Cys Cys Thr Ala Ser Ile 1 5 10 15 Leu His Leu Cys Val Ile Ala Leu Asp Arg Tyr Trp Ala Ile Thr Asp 20 25 30 Ala Leu Glu Tyr Ser Lys Arg Arg Thr Ala Gly Arg Ala Ala Val Met 35 40 45 Ile Ala Thr Val Trp Val Ile Ser Ile 50 55 56 amino acids amino acid single linear peptide unknown 46 Cys Glu Ile Tyr Leu Ala Leu Asp Val Leu Phe Cys Thr Ser Ser Ile 1 5 10 15 Val His Leu Cys Ala Ile Ser Leu Asp Arg Tyr Trp Ser Ile Thr Gln 20 25 30 Ala Ile Glu Tyr Asn Leu Lys Arg Thr Arg Arg Arg Ile Lys Ala Ile 35 40 45 Ile Thr Cys Trp Val Ile Ser Ala 50 55 56 amino acids amino acid single linear peptide unknown 47 Cys Asp Leu Phe Ile Ala Leu Asp Val Leu Cys Cys Thr Ser Ser Ile 1 5 10 15 Leu His Leu Cys Ala Ile Ala Leu Asp Arg Tyr Trp Ala Ile Thr Asp 20 25 30 Pro Ile Asp Tyr Val Asn Lys Arg Thr Pro Arg Pro Arg Ala Leu Ile 35 40 45 Ser Leu Thr Trp Leu Ile Gly Phe 50 55 57 amino acids amino acid single linear peptide 
unknown 48 Cys Pro Val Trp Ile Ser Leu Asp Val Leu Phe Ser Thr Ala Ser Ile 1 5 10 15 Met His Leu Cys Ala Ile Ser Leu Asp Arg Tyr Val Ala Ile Arg Asn 20 25 30 Pro Ile Glu His Ser Arg Phe Asn Ser Arg Thr Lys Ala Ile Met Lys 35 40 45 Ile Ala Ile Val Trp Ala Ile Ser Ile 50 55 57 amino acids amino acid single linear peptide unknown 49 Cys Ala Ile Trp Ile Tyr Leu Asp Val Leu Phe Ser Thr Ala Ser Ile 1 5 10 15 Met His Leu Cys Ala Ile Ser Leu Asp Arg Tyr Val Ala Ile Gln Asn 20 25 30 Pro Ile His His Ser Arg Phe Asn Ser Arg Thr Lys Ala Phe Leu Lys 35 40 45 Ile Ile Ala Val Trp Thr Ile Ser Val 50 55 57 amino acids amino acid single linear peptide unknown 50 Cys Leu Phe Phe Ala Cys Phe Val Leu Val Leu Thr Gln Ser Ser Ile 1 5 10 15 Phe Ser Leu Leu Ala Ile Ala Ile Asp Arg Tyr Ile Ala Ile Arg Ile 20 25 30 Pro Leu Arg Tyr Asn Gly Leu Val Thr Gly Thr Arg Ala Lys Gly Ile 35 40 45 Ile Ala Val Cys Trp Val Leu Ser Phe 50 55 57 amino acids amino acid single linear peptide unknown 51 Cys Leu Met Val Ala Cys Pro Val Leu Ile Leu Thr Gln Ser Ser Ile 1 5 10 15 Leu Ala Leu Leu Ala Ile Ala Val Asp Arg Tyr Leu Arg Val Lys Ile 20 25 30 Pro Leu Arg Tyr Lys Thr Val Val Thr Pro Arg Arg Ala Ala Val Ala 35 40 45 Ile Ala Gly Cys Trp Ile Leu Ser Phe 50 55 55 amino acids amino acid single linear peptide unknown 52 Cys Tyr Phe Gln Asn Leu Phe Pro Ile Thr Ala Met Phe Val Ser Ile 1 5 10 15 Tyr Ser Met Thr Ala Ile Ala Ala Asp Arg Tyr Met Ala Ile Val His 20 25 30 Pro Phe Gln Pro Arg Leu Ser Ala Pro Gly Thr Arg Ala Val Ile Ala 35 40 45 Gly Ile Trp Leu Val Ala Leu 50 55 57 amino acids amino acid single linear peptide unknown 53 Cys Lys Ile Thr His Leu Ile Phe Ser Ile Asn Leu Phe Gly Ser Ile 1 5 10 15 Phe Phe Leu Thr Cys Met Ser Val Asp Arg Tyr Leu Ser Ile Thr Tyr 20 25 30 Phe Ala Ser Thr Ser Ser Arg Arg Lys Lys Val Val Arg Arg Ala Val 35 40 45 Cys Val Leu Val Trp Leu Leu Ala Phe 50 55 764 amino acids amino acid single linear peptide unknown 54 Met Arg Pro Pro Pro Leu Leu His Leu Ala Leu Leu Leu Ala Leu Pro 1 5 10 15 Arg Ser Leu Gly Gly Lys 
Gly Cys Pro Ser Pro Pro Cys Glu Cys His 20 25 30 Gln Glu AspAsp Phe Arg Val Thr Cys Lys Asp Ile His Arg Ile Pro 35 40 45 Thr Leu ProPro Ser Thr Gln Thr Leu Lys Phe Ile Glu Thr Gln Leu 50 55 60 Lys Thr IlePro Ser Arg Ala Phe Ser Asn Leu Pro Asn Ile Ser Arg 65 70 75 80 Ile TyrLeu Ser Ile Asp Ala Thr Leu Gln Arg Leu Glu Ser His Ser 85 90 95 Phe TyrAsn Leu Ser Lys Met Thr His Ile Glu Ile Arg Asn Thr Arg 100 105 110 SerLeu Thr Ser Ile Asp Pro Asp Ala Leu Lys Glu Leu Pro Leu Leu 115 120 125Lys Phe Leu Gly Ile Phe Asn Thr Gly Leu Gly Val Phe Pro Asp Val 130 135140 Thr Lys Val Tyr Ser Thr Asp Val Phe Phe Ile Leu Glu Ile Thr Asp 145150 155 160 Asn Pro Tyr Met Ala Ser Ile Pro Ala Asn Ala Phe Gln Gly LeuCys 165 170 175 Asn Glu Thr Leu Thr Leu Lys Leu Tyr Asn Asn Gly Phe ThrSer Ile 180 185 190 Gln Gly His Ala Phe Asn Gly Thr Lys Leu Asp Ala ValTyr Leu Asn 195 200 205 Lys Asn Lys Tyr Leu Ser Ala Ile Asp Lys Asp AlaPhe Gly Gly Val 210 215 220 Tyr Ser Gly Pro Thr Leu Leu Asp Val Ser TyrThr Ser Val Thr Ala 225 230 235 240 Leu Pro Ser Lys Gly Leu Glu His LeuLys Glu Leu Ile Ala Arg Asn 245 250 255 Thr Trp Thr Leu Lys Lys Leu ProLeu Ser Leu Ser Phe Leu His Leu 260 265 270 Thr Arg Ala Asp Leu Ser TyrPro Ser His Cys Cys Ala Phe Lys Asn 275 280 285 Gln Lys Lys Ile Arg GlyIle Leu Glu Ser Leu Met Cys Asn Glu Ser 290 295 300 Ser Ile Arg Ser LeuArg Gln Arg Lys Ser Val Asn Thr Leu Asn Gly 305 310 315 320 Pro Phe AspGln Glu Tyr Glu Glu Tyr Leu Gly Asp Ser His Ala Gly 325 330 335 Tyr LysAsp Asn Ser Gln Phe Gln Asp Thr Asp Ser Asn Ser His Tyr 340 345 350 TyrVal Phe Phe Glu Glu Gln Glu Asp Glu Ile Leu Gly Phe Gly Gln 355 360 365Glu Leu Lys Asn Pro Gln Glu Glu Thr Leu Gln Ala Phe Asp Ser His 370 375380 Tyr Asp Tyr Thr Val Cys Gly Gly Asn Glu Asp Met Val Cys Thr Pro 385390 395 400 Lys Ser Asp Glu Phe Asn Pro Cys Glu Asp Ile Met Gly Tyr LysPhe 405 410 415 Leu Arg Ile Val Val Trp Phe Val Ser Leu Leu Ala Leu LeuGly Asn 420 425 430 Val Phe Val Leu Ile Val Leu Leu Thr Ser His Tyr LysLeu Thr Val 435 440 
445 Pro Arg Phe Leu Met Cys Asn Leu Ala Phe Ala AspPhe Cys Met Gly 450 455 460 Met Tyr Leu Leu Leu Ile Ala Ser Val Asp LeuTyr Thr His Ser Glu 465 470 475 480 Tyr Tyr Asn His Ala Ile Asp Trp GlnThr Gly Pro Gly Cys Asn Thr 485 490 495 Ala Gly Phe Phe Thr Val Phe AlaSer Glu Leu Ser Val Tyr Thr Leu 500 505 510 Thr Val Ile Thr Leu Glu ArgTrp Tyr Ala Ile Thr Phe Ala Met Arg 515 520 525 Leu Asp Arg Lys Ile ArgLeu Arg His Ala Tyr Ala Ile Met Val Gly 530 535 540 Gly Trp Val Cys CysPhe Leu Leu Ala Leu Leu Pro Leu Val Gly Ile 545 550 555 560 Ser Ser TyrAla Lys Val Ser Ile Cys Leu Pro Met Asp Thr Glu Thr 565 570 575 Pro LeuAla Leu Ala Tyr Ile Ile Leu Val Leu Leu Leu Asn Ile Val 580 585 590 AlaPhe Ile Ile Val Cys Ser Cys Tyr Val Lys Ile Tyr Ile Thr Val 595 600 605Arg Asn Pro Gln Tyr Asn Pro Gly Asp Lys Asp Thr Lys Ile Ala Lys 610 615620 Arg Met Ala Val Leu Ile Phe Thr Asp Phe Met Cys Met Ala Pro Ile 625630 635 640 Ser Phe Tyr Ala Leu Ser Ala Leu Met Asn Lys Pro Leu Ile ThrVal 645 650 655 Thr Asn Ser Lys Ile Leu Leu Val Leu Phe Tyr Pro Leu AsnSer Cys 660 665 670 Ala Asn Pro Phe Leu Tyr Ala Ile Phe Thr Lys Ala PheGln Arg Asp 675 680 685 Val Phe Ile Leu Leu Ser Lys Phe Gly Ile Cys LysArg Gln Ala Gln 690 695 700 Ala Tyr Arg Gly Gln Arg Val Ser Pro Lys AsnSer Ala Gly Ile Gln 705 710 715 720 Ile Gln Lys Val Thr Arg Asp Met ArgGln Ser Leu Pro Asn Met Gln 725 730 735 Asp Glu Tyr Glu Leu Leu Glu AsnSer His Leu Thr Pro Asn Lys Gln 740 745 750 Gly Gln Ile Ser Lys Glu TyrAsn Gln Thr Val Leu 755 760 795 amino acids amino acid single linearprotein unknown 55 Arg Ala Thr His Cys Gly Met Gly Arg Arg Val Pro AlaLeu Arg Gln 1 5 10 15 Leu Leu Val Leu Ala Val Leu Leu Leu Lys Pro SerGln Leu Gln Ser 20 25 30 Arg Glu Leu Ser Gly Ser Arg Cys Pro Glu Pro CysAsp Cys Ala Pro 35 40 45 Asp Gly Ala Leu Arg Ala Thr His Cys Gly Arg CysPro Gly Pro Arg 50 55 60 Ala Gly Leu Ala Arg Leu Ser Leu Thr Tyr Leu ProVal Lys Val Ile 65 70 75 80 Pro Ser Gln Ala Phe Arg Gly Leu Asn Glu ValVal Lys Ile Glu Ile 85 90 95 Ser 
Gln Ser Asp Ser Leu Glu Arg Ala Thr HisCys Gly Arg Ile Glu 100 105 110 Ala Asn Ala Phe Asp Asn Leu Leu Asn LeuSer Glu Leu Leu Ile Gln 115 120 125 Asn Thr Lys Asn Leu Leu Tyr Ile GluPro Gly Ala Phe Thr Asn Leu 130 135 140 Pro Arg Leu Lys Tyr Leu Ser IleCys Asn Thr Gly Ile Arg Thr Arg 145 150 155 160 Ala Thr His Cys Gly LeuPro Asp Val Thr Lys Ile Ser Ser Ser Glu 165 170 175 Phe Asn Phe Ile LeuGlu Ile Cys Asp Asn Leu His Ile Thr Thr Ile 180 185 190 Pro Gly Asn AlaPhe Gln Gly Met Asn Asn Glu Ser Val Thr Leu Lys 195 200 205 Leu Tyr GlyAsn Gly Phe Glu Arg Ala Thr His Cys Gly Glu Val Gln 210 215 220 Ser HisAla Phe Asn Gly Thr Thr Leu Ile Ser Leu Glu Leu Lys Glu 225 230 235 240Asn Ile Tyr Leu Glu Lys Met His Ser Gly Ala Phe Gln Gly Ala Thr 245 250255 Gly Pro Ser Ile Leu Asp Ile Ser Ser Thr Lys Leu Gln Ala Arg Ala 260265 270 Thr His Cys Gly Leu Pro Ser His Gly Leu Glu Ser Ile Gln Thr Leu275 280 285 Ile Ala Leu Ser Ser Tyr Ser Leu Lys Thr Leu Pro Ser Lys GluLys 290 295 300 Phe Thr Ser Leu Leu Val Ala Thr Leu Thr Tyr Pro Ser HisCys Cys 305 310 315 320 Ala Phe Arg Asn Leu Pro Arg Ala Thr His Cys GlyLys Lys Glu Gln 325 330 335 Asn Phe Ser Phe Ser Ile Phe Glu Asn Phe SerLys Gln Cys Glu Ser 340 345 350 Thr Val Arg Lys Ala Asp Asn Glu Thr LeuTyr Ser Ala Ile Phe Glu 355 360 365 Glu Asn Glu Leu Ser Gly Trp Asp ArgAla Thr His Cys Gly Tyr Asp 370 375 380 Tyr Gly Arg Ala Thr His Cys GlyPhe Ser Pro Lys Thr Leu Gln Cys 385 390 395 400 Ala Pro Glu Pro Asp AlaPhe Asn Pro Cys Glu Asp Ile Met Gly Tyr 405 410 415 Ala Phe Leu Arg ValLeu Ile Trp Leu Ile Asn Ile Leu Ala Ile Phe 420 425 430 Gly Asn Leu ThrVal Leu Phe Val Arg Ala Thr His Cys Gly Leu Leu 435 440 445 Thr Ser ArgTyr Lys Leu Thr Val Pro Arg Phe Leu Met Cys Asn Leu 450 455 460 Ser PheAla Asp Phe Cys Met Gly Leu Tyr Leu Leu Leu Ile Ala Ser 465 470 475 480Val Asp Ser Gln Thr Lys Gly Gln Tyr Tyr Asn His Ala Ile Asp Trp 485 490495 Arg Ala Thr His Cys Gly Gln Thr Gly Ser Gly Cys Gly Ala Ala Gly 500505 510 Phe Phe Thr Val Phe Ala Ser Glu Leu 
Ser Val Tyr Thr Leu Thr Val515 520 525 Ile Thr Leu Glu Arg Trp His Thr Ile Thr Tyr Ala Val Gln LeuAsp 530 535 540 Gln Lys Leu Arg Leu Arg His Ala Arg Ala Thr His Cys GlyIle Pro 545 550 555 560 Ile Met Leu Gly Gly Trp Leu Phe Ser Thr Leu IleAla Thr Met Pro 565 570 575 Leu Val Gly Ile Ser Asn Tyr Met Lys Val SerIle Cys Leu Pro Met 580 585 590 Asp Val Glu Ser Thr Leu Ser Gln Val TyrIle Leu Ser Ile Leu Ile 595 600 605 Arg Ala Thr His Cys Gly Leu Asn ValVal Ala Phe Val Val Ile Cys 610 615 620 Ala Cys Tyr Ile Arg Ile Tyr PheAla Val Gln Asn Pro Glu Leu Thr 625 630 635 640 Ala Pro Asn Lys Asp ThrLys Ile Ala Lys Lys Met Ala Ile Leu Ile 645 650 655 Phe Thr Asp Phe ThrCys Met Ala Arg Ala Thr His Cys Gly Pro Ile 660 665 670 Ser Phe Phe AlaIle Ser Ala Ala Phe Lys Val Pro Leu Ile Thr Val 675 680 685 Thr Asn SerLys Ile Leu Leu Val Leu Phe Tyr Pro Val Asn Ser Cys 690 695 700 Ala AsnPro Phe Leu Tyr Ala Ile Phe Thr Lys Ala Phe Gln Arg Asp 705 710 715 720Arg Ala Thr His Cys Gly Phe Leu Leu Leu Leu Ser Arg Phe Gly Cys 725 730735 Cys Lys Arg Arg Ala Glu Leu Tyr Arg Arg Lys Glu Phe Ser Ala Tyr 740745 750 Thr Ser Asn Cys Lys Asn Gly Phe Pro Gly Ala Ser Lys Pro Ser Gln755 760 765 Ala Thr Leu Lys Leu Ser Thr Val Arg Ala Thr His Cys Gly HisCys 770 775 780 Gln Gln Pro Ile Pro Pro Arg Ala Leu Thr His 785 790 795792 amino acids amino acid single linear protein unknown 56 Pro Ile GlyHis Cys Gly Met Arg Arg Arg Ser Leu Ala Leu Arg Leu 1 5 10 15 Leu LeuAla Leu Leu Leu Leu Pro Pro Pro Leu Pro Gln Thr Leu Leu 20 25 30 Gly AlaPro Cys Pro Glu Pro Cys Ser Cys Arg Pro Asp Gly Ala Leu 35 40 45 Pro IleGly His Cys Gly Arg Cys Pro Gly Pro Arg Ala Gly Leu Ser 50 55 60 Arg LeuSer Leu Thr Tyr Leu Pro Ile Lys Val Ile Pro Ser Gln Ala 65 70 75 80 PheArg Gly Leu Asn Glu Val Val Lys Ile Glu Ile Ser Gln Ser Asp 85 90 95 SerLeu Glu Pro Ile Gly His Cys Gly Lys Ile Glu Ala Asn Ala Phe 100 105 110Asp Asn Leu Leu Asn Leu Ser Glu Ile Leu Ile Gln Asn Thr Lys Asn 115 120125 Leu Val Tyr Ile Glu Pro Gly Ala Phe Thr Asn Leu 
Pro Arg Leu Lys 130135 140 Tyr Leu Ser Ile Cys Asn Thr Gly Ile Arg Lys Pro Ile Gly His Cys145 150 155 160 Gly Leu Pro Asp Val Thr Lys Ile Phe Ser Ser Glu Phe AsnPhe Ile 165 170 175 Leu Glu Ile Cys Asp Asn Leu His Ile Thr Thr Val ProAla Asn Ala 180 185 190 Phe Gln Gly Met Asn Asn Glu Ser Ile Thr Leu LysLeu Tyr Gly Asn 195 200 205 Gly Phe Glu Pro Ile Gly His Cys Gly Glu IleGln Ser His Ala Phe 210 215 220 Asn Gly Thr Leu Leu Ile Ser Leu Glu LeuLys Glu Asn Ala His Leu 225 230 235 240 Lys Lys Met His Asn Asp Ala PheArg Gly Ala Arg Gly Pro Ser Ile 245 250 255 Leu Asp Ile Ser Ser Thr LysLeu Gln Ala Pro Ile Gly His Cys Gly 260 265 270 Leu Pro Ser Tyr Gly LeuGlu Ser Ile Gln Thr Leu Ile Ala Thr Ser 275 280 285 Ser Tyr Ser Leu LysLys Leu Pro Ser Arg Glu Lys Phe Thr Asn Leu 290 295 300 Leu Asp Ala ThrLeu Thr Tyr Pro Ser His Cys Cys Ala Phe Arg Asn 305 310 315 320 Leu ProPro Ile Gly His Cys Gly Thr Lys Glu Gln Asn Phe Ser Phe 325 330 335 SerIle Phe Lys Asn Phe Ser Lys Gln Cys Glu Ser Thr Ala Arg Arg 340 345 350Pro Asn Asn Glu Thr Leu Tyr Ser Ala Ile Phe Ala Glu Ser Glu Leu 355 360365 Ser Asp Trp Asp Pro Ile Gly His Cys Gly Tyr Asp Tyr Gly Pro Ile 370375 380 Gly His Cys Gly Phe Cys Ser Pro Lys Thr Leu Gln Cys Ala Pro Glu385 390 395 400 Pro Asp Ala Phe Asn Pro Cys Glu Asp Ile Met Gly Tyr AspPhe Leu 405 410 415 Arg Val Leu Ile Trp Leu Ile Asn Ile Leu Ala Ile MetGly Asn Val 420 425 430 Thr Val Leu Phe Ala Pro Ile Gly His Cys Gly LeuLeu Thr Ser His 435 440 445 Tyr Lys Leu Thr Val Pro Arg Phe Leu Met CysAsn Leu Ser Phe Ala 450 455 460 Asp Phe Cys Met Gly Leu Tyr Leu Leu LeuIle Ala Ser Val Asp Ala 465 470 475 480 Gln Thr Lys Gly Gln Tyr Tyr AsnHis Ala Ile Asp Trp Pro Ile Gly 485 490 495 His Cys Gly Gln Thr Gly AsnGly Cys Ser Val Ala Gly Phe Phe Thr 500 505 510 Val Phe Ala Ser Glu LeuSer Val Tyr Thr Leu Thr Val Ile Thr Leu 515 520 525 Glu Arg Trp His ThrIle Thr Tyr Ala Ile Gln Leu Asp Gln Lys Leu 530 535 540 Arg Leu Arg HisAla Pro Ile Gly His Cys Gly Ile Pro Ile Met Leu 545 550 555 560 
Gly GlyTrp Leu Phe Ser Thr Leu Ile Ala Met Leu Pro Leu Val Gly 565 570 575 ValSer Ser Tyr Met Lys Val Ser Ile Cys Leu Pro Met Asp Val Glu 580 585 590Thr Thr Leu Ser Gln Val Tyr Ile Leu Thr Ile Leu Ile Pro Ile Gly 595 600605 His Cys Gly Leu Asn Val Val Ala Phe Ile Ile Ile Cys Ala Cys Tyr 610615 620 Ile Lys Ile Tyr Phe Ala Val Gln Asn Pro Glu Leu Met Ala Thr Asn625 630 635 640 Lys Asp Thr Lys Ile Ala Lys Lys Met Ala Val Leu Ile PheThr Asp 645 650 655 Phe Thr Cys Met Ala Pro Ile Gly His Cys Gly Pro IleSer Phe Phe 660 665 670 Ala Ile Ser Ala Ala Leu Lys Val Pro Leu Ile ThrVal Thr Asn Ser 675 680 685 Lys Val Leu Leu Val Leu Phe Tyr Pro Val AsnSer Cys Ala Asn Pro 690 695 700 Phe Leu Tyr Ala Ile Phe Thr Lys Ala PheArg Arg Asp Pro Ile Gly 705 710 715 720 His Cys Gly Phe Phe Leu Leu LeuSer Lys Ser Gly Cys Cys Lys His 725 730 735 Gln Ala Glu Leu Tyr Arg ArgLys Asp Phe Ser Ala Tyr Cys Lys Asn 740 745 750 Gly Phe Thr Gly Ser AsnLys Pro Ser Arg Ser Thr Leu Lys Leu Thr 755 760 765 Thr Leu Pro Ile GlyHis Cys Gly Gln Cys Gln Tyr Ser Thr Val Met 770 775 780 Asp Lys Thr CysTyr Lys Asp Cys 785 790 4417 base pairs nucleic acid single linear cDNAunknown 57 CAGGCGCAGA GGGGCCCAGA CGACCGTGGA GGATGAAGAA ATAGCCTTGGGACCCTTGGA 60 AAATGAGGCC GCCGCCCCTG CTGCACCTGG CGCTGCTTCT CGCCCTGCCCAGGAGCCTGG 120 GGGGGAAGGG GTGTCCTTCT CCCCCCTGTG AGTGCCACCA GGAGGATGACTTCAGAGTCA 180 CCTGCAAGGA TATCCACCGC ATCCCCACCC TACCACCCAG CACGCAGACTCTGAAGTTTA 240 TAGAGACTCA GCTGAAAACC ATTCCCAGTC GTGCATTTTC AAATCTGCCCAATATTTCCA 300 GGATCTACTT GTCAATAGAT GCAACTCTGC AGCGGCTGGA ATCACATTCCTTCTACAATT 360 TAAGTAAAAT GACTCACATA GAGATTCGGA ATACCAGAAG CTTAACATCCATAGACCCTG 420 ACGCCCTAAA AGAGCTCCCA CTCCTGAAGT TCCTTGGCAT TTTCAACACTGGACTTGGAG 480 TATTCCCTGA TGTGACCAAA GTTTATTCCA CTGATGTATT CTTTATACTTGAAATCACAG 540 ACAACCCTTA CATGGCTTCC ATCCCTGCCA ATGCTTTCCA GGGGCTGTGCAATGAAACCC 600 TGACACTGAA ACTATACAAC AATGGCTTTA CTTCAATCCA AGGACATGCTTTCAATGGGA 660 CAAAACTGGA TGCTGTTTAC CTGAACAAGA ATAAATACCT GTCAGCTATCGACAAAGATG 720 CATTTGGAGG 
AGTGTACAGT GGACCAACCT TGCTGGATGT CTCTTACACCAGTGTTACTG 780 CCCTGCCATC CAAAGGCCTG GAGCATCTAA AGGAGCTGAT AGCAAGAAACACTTGGACTC 840 TAAAGAAACT CCCACTTTCC TTGAGTTTCC TTCACCTTAC ACGGGCTGACCTTTCTTATC 900 CAAGCCACTG CTGTGCTTTT AAGAATCAGA AGAAAATCAG AGGAATCCTTGAGTCCTTAA 960 TGTGTAATGA AAGCAGTATT CGGAGCCTGC GCCAGAGAAA ATCTGTGAATACTTTGAATG 1020 GCCCCTTTGA CCAGGAATAT GAAGAGTATC TGGGTGACAG CCATGCTGGGTACAAGGACA 1080 ACTCTCAGTT CCAGGATACC GATAGCAATT CTCATTATTA TGTCTTCTTCGAAGAACAAG 1140 AAGATGAGAT CCTCGGTTTT GGGCAGGAGC TTAAAAACCC ACAGGAAGAGACCCTCCAGG 1200 CCTTTGATAG CCATTATGAC TACACTGTGT GTGGTGGCAA TGAAGACATGGTGTGTACTC 1260 CTAAGTCAGA TGAGTTCAAC CCCTGTGAAG ACATAATGGG CTACAAGTTCCTGAGGATTG 1320 TGGTGTGGTT TGTTAGTCTG CTGGCTCTCC TGGGCAATGT CTTTGTCCTGATCGTCCTCC 1380 TTACCAGTCA CTACAAATTG ACTGTCCCAC GCTTTCTCAT GTGCAACTTGGCCTTTGCAG 1440 ATTTCTGCAT GGGGATGTAT CTGCTCCTCA TCGCCTCCGT AGACCTCTACACTCATTCTG 1500 AGTACTACAA CCATGCCATC GACTGGCAGA CAGGCCCTGG GTGTAACACAGCTGGTTTCT 1560 TCACTGTCTT TGCCAGTGAA TTATCAGTGT ATACACTGAC AGTCATCACCCTGGAGCGCT 1620 GGTATGCCAT TACCTTCGCC ATGCGCCTGG ACAGGAAGAT CCGCCTCAGGCATGCATATG 1680 CCATCATGGT TGGGGGCTGG GTTTGCTGCT TCCTGCTCGC CCTGCTCCCTCTGGTGGGAA 1740 TAAGCAGCTA TGCCAAGGTC AGCATCTGCC TGCCCATGGA CACTGAGACACCTCTTGCCC 1800 TGGCATATAT TATCCTTGTT CTGTTGCTCA ACATAGTTGC CTTTATCATTGTCTGCTCCT 1860 GTTATGTGAA GATCTACATC ACAGTCCGAA ATCCCCAGTA CAACCCGGGGGACAAAGACA 1920 CCAAAATTGC CAAAAGGATG GCTGTATTGA TCTTCACTGA CTTCATGTGCATGGCCCCAA 1980 TCTCATTCTA CGCTCTGTCA GCACTTATGA ACAAGCCTCT CATCACTGTTACCAACTCCA 2040 AAATCTTGCT GGTTCTCTTC TATCCACTTA ACTCCTGTGC CAATCCATTTCTCTATGCTA 2100 TTTTCACGAA AGCCTTCCAG AGGGATGTAT TTATCCTGCT CAGCAAGTTTGGCATCTGTA 2160 AACGCCAGGC TCAGGCATAC CGGGGCCAGA GGGTTTCTCC AAAGAATAGTGCTGGTATTC 2220 AGATCCAAAA GGTTACCCGG GACATGAGGC AAAGTCTCCC CAACATGCAGGATGAGTATG 2280 AACTGCTTGA AAACTCCCAT CTAACCCCAA ATAAGCAGGG CCAAATCTCAAAAGAGTATA 2340 ACCAAACAGT TCTGTAAGCA GACCCTATAC TACTCGCAGT GGCAGGTGGACTTCTAAAAA 2400 TCTAGTTTCT TGAACACGTA TTCCAAATTC ATTATATACA 
CAAGACAGCTGACCTAACCC 2460 TTTGCAGGTG ATGTTTCATG GGGCAAATTT CATCTCCAAA AAGGGGGTAGCTCTACCACC 2520 TAATCATTAC CTCCCAGAAG GAAGAGAGGC TACCAGCACT TCTGAACCCTGGTGATATCA 2580 AGATAACTGA CACTTTCTAG AAAACTTGTT TGATGCTAAC TGCTTTAACAACATTGTATA 2640 AGATGTCCAA CAGATATTAA CTGAACCAGG TCAACATTGA GCTTCTCACTTTCAAATAGC 2700 ATTTCATAGT AAAGATTCTG CAAATGGCAA ATGCTATTAA CTGAGTTGGTGACCACAAGA 2760 TAGAATTAGC CCCATGTTGG CTTGGTCCAC CTTCATGTTC TTGGATACAACCAAAGAGAA 2820 TGTGAATTCC TCGAAACTGA AAAGTCCAGC AGGATACATG CATGAAGCAGCTATTATGAG 2880 GTGGAAGGAG GGGAAAGGCT TAGCTTAGTT GTTATTTCAG CCTCTGAAACTATATCATCT 2940 CTTCACAAGG ACCTACCTGA TGTGACCCAA CTGTTAGGTG TTGCCCAGGGGGGAAAAAAA 3000 CTGGCAAGAT TTCAGCTTAT GTGGCTGAGC AAAGTAAGAA TTGTTCTTCTTGGCTAGTCT 3060 TATAGCATAA AATACGTGAA CCCTAGAAAT ATTTCTAAGT AGCAGCAAGTGGGAATTATG 3120 AGCAGGGCAC ACTAAATCAC ACACTGATTA ATAAAGCAGG GCCACAAGGTAACTGTTGGA 3180 GCTTGGGCAA ATCACTGGGC CACTTCTAAG TCTAGAAATG AGAGAGCCTGATTGCTTCTT 3240 CAGTTTCAAA ACTCTATGTA TATCCCTTCC CCTTAAAATA TGTTTCCATGACAAAAAAGA 3300 AAAGCACTAA AAAAGAAAAG AAAAGAAAAG CACTAAGAAA GAAAAATTTATTTTTCCTAT 3360 CTTGTAGTGC AGCCACCTCT TTCTCTTTGG AGGCTGGATA TATGACCCAGGACATTTCTT 3420 TCTTTTTTTT ATTTTTTTTT TCATTTTTGA TTATAATGTC TGATCCATGTTGGGCTGGAT 3480 CTAAATCACT CAACTAATTA CTAGATCTCT ACAGCTACAA TTATCAGGCCAAAAACAGAC 3540 TCATATTCAC ATAACAGAAT AAAAGGTGGT TTTGCAAATT TTGGTTATTCAGAGTTACTA 3600 CTTCACTGTA TAGATTAACT TGAAAACATT TAACTTGTCC AGGGATTGGAAGCTATCAAA 3660 CACTCAGGCA AAGCAACACT AAAGCTATCA AGAGAAGTTT CTTCTCTCCAAAACTGCTAG 3720 CCTTTTCCAA CCTGTTGATC ATTGGACATA ATCTCTATTG CCCAATAGTGTTCTCTTACT 3780 TAAAATGGTT AGGATCAATC TTTTAATATA GACGTACTCT TCAGATTACCTGTCAAAACA 3840 GTCCCTTAAT TTCCTCCCAA GCAGAGATGG CATTTGCTTC TCAATGTTCATGAAGCACAC 3900 CAAGGAATTA GAAGCATTTG TTGTTTCAAG TCTGTGGAGT AGGGTTACTGGGCCCAATGC 3960 CCCCCCCCCC ACAGAGATGG TCCCCCAACC CACCTAGGAT ATCCCAATAGCAATACCCAT 4020 TTCTGATTAT CATTGAGATT GGACATCTTA GTAGAAATAT TATACACACTCGAAATCATG 4080 ACTTATCCAC CAGTTCACTT GTAACTAATA ACTAAACAGT TGTGTTATCGTTTGGCATGT 4140 GTTTCTCACC 
TGTGACATTT TGAAATAGTA CATCCTGATA ATGTATTTTATCTTAAGTAG 4200 TTGAAATAAC ACTTTGGAAA CCGTCCTAGA AAAGTAACTT CAACACAATTGTTACTAAAA 4260 TTTGCATTCA CAACATGAAA TAAATTTTCT TCCTATGAAA TGATTGTGCTGAGTCCTACA 4320 GTATGGCATT TTGTAATTTG TGAGCTTCTT TTAATGTTAC CGTTATATGTGTTACAACTG 4380 AAGACAGGGA AAAAAAAACA ACTGGCAAAT TTGCTAA 4417 277 aminoacids amino acid single linear protein unknown 58 Met Arg Pro Pro ProLeu Leu His Leu Ala Leu Leu Leu Ala Leu Pro 1 5 10 15 Arg Ser Leu GlyGly Lys Gly Cys Pro Ser Pro Pro Cys Glu Cys His 20 25 30 Gln Glu Asp AspPhe Arg Val Thr Cys Lys Asp Ile His Arg Ile Pro 35 40 45 Thr Leu Pro ProSer Thr Gln Thr Leu Lys Phe Ile Glu Thr Gln Leu 50 55 60 Lys Thr Ile ProSer Arg Ala Phe Ser Asn Leu Pro Asn Ile Ser Arg 65 70 75 80 Ile Tyr LeuSer Ile Asp Ala Thr Leu Gln Arg Leu Glu Ser His Ser 85 90 95 Phe Tyr AsnLeu Ser Lys Met Thr His Ile Glu Ile Arg Asn Thr Arg 100 105 110 Ser LeuThr Ser Ile Asp Pro Asp Ala Leu Lys Glu Leu Pro Leu Leu 115 120 125 LysPhe Leu Gly Ile Phe Asn Thr Gly Leu Gly Val Phe Pro Asp Val 130 135 140Thr Lys Val Tyr Ser Thr Asp Val Phe Phe Ile Leu Glu Ile Thr Asp 145 150155 160 Asn Pro Tyr Met Ala Ser Ile Pro Ala Asn Ala Phe Gln Gly Leu Cys165 170 175 Asn Glu Thr Leu Thr Leu Lys Leu Tyr Asn Asn Gly Phe Thr SerIle 180 185 190 Gln Gly His Ala Phe Asn Gly Thr Lys Leu Asp Ala Val TyrLeu Asn 195 200 205 Lys Asn Lys Tyr Leu Ser Ala Ile Asp Lys Asp Ala PheGly Gly Val 210 215 220 Tyr Ser Gly Pro Thr Leu Leu Asp Val Ser Tyr ThrSer Val Thr Ala 225 230 235 240 Leu Pro Ser Lys Gly Leu Glu His Leu LysGlu Leu Ile Ala Arg Asn 245 250 255 Thr Trp Thr Leu Lys Lys Leu Pro LeuSer Leu Ser Phe Leu His Leu 260 265 270 Thr Arg Ala Asp Leu 275 764amino acids amino acid single linear protein unknown 59 Met Arg Pro AlaAsp Leu Leu Gln Leu Val Leu Leu Leu Asp Leu Pro 1 5 10 15 Arg Asp LeuGly Gly Met Gly Cys Ser Ser Pro Pro Cys Glu Cys His 20 25 30 Gln Glu GluAsp Phe Arg Val Thr Cys Lys Asp Ile Gln Arg Ile Pro 35 40 45 Ser Leu ProPro Ser Thr Gln Thr Leu Lys Leu Ile Glu Thr 
His Leu 50 55 60 Arg Thr IlePro Ser His Ala Phe Ser Asn Leu Pro Asn Ile Ser Arg 65 70 75 80 Ile TyrVal Ser Ile Asp Leu Thr Leu Gln Gln Leu Glu Ser His Ser 85 90 95 Phe TyrAsn Leu Ser Lys Val Thr His Ile Glu Ile Arg Asn Thr Arg 100 105 110 AsnLeu Thr Tyr Ile Asp Pro Asp Ala Leu Lys Glu Leu Pro Leu Leu 115 120 125Lys Phe Leu Gly Ile Phe Asn Thr Gly Leu Lys Met Phe Pro Asp Leu 130 135140 Thr Lys Val Tyr Ser Thr Asp Ile Phe Phe Ile Leu Glu Ile Thr Asp 145150 155 160 Asn Pro Tyr Met Thr Ser Ile Pro Val Asn Ala Phe Gln Gly LeuCys 165 170 175 Asn Glu Thr Leu Thr Leu Lys Leu Tyr Asn Asn Gly Phe ThrSer Val 180 185 190 Gln Gly Tyr Ala Phe Asn Gly Thr Lys Leu Asp Ala ValTyr Leu Asn 195 200 205 Lys Asn Lys Tyr Leu Thr Val Ile Asp Lys Asp AlaPhe Gly Gly Val 210 215 220 Tyr Ser Gly Pro Ser Leu Leu Asp Val Ser GlnThr Ser Val Thr Ala 225 230 235 240 Leu Pro Ser Lys Gly Leu Glu His LeuLys Glu Leu Ile Ala Arg Asn 245 250 255 Thr Trp Thr Leu Lys Lys Leu ProLeu Ser Leu Ser Phe Leu His Leu 260 265 270 Thr Arg Ala Asp Leu Ser TyrPro Ser His Cys Cys Ala Phe Lys Asn 275 280 285 Gln Lys Lys Ile Arg GlyIle Leu Glu Ser Leu Met Cys Asn Glu Ser 290 295 300 Ser Met Gln Ser LeuArg Gln Arg Lys Ser Val Asn Ala Leu Asn Ser 305 310 315 320 Pro Leu HisGln Glu Tyr Glu Glu Asn Leu Gly Asp Ser Ile Val Gly 325 330 335 Tyr LysGlu Lys Ser Lys Phe Gln Asp Thr His Asn Asn Ala His Tyr 340 345 350 TyrVal Phe Phe Glu Glu Gln Glu Asp Glu Ile Ile Gly Phe Gly Gln 355 360 365Glu Leu Lys Asn Pro Gln Glu Glu Thr Leu Gln Ala Phe Asp Ser His 370 375380 Tyr Asp Tyr Thr Ile Cys Gly Asp Ser Glu Asp Met Val Cys Thr Pro 385390 395 400 Lys Ser Asp Glu Phe Asn Pro Cys Glu Asp Ile Met Gly Tyr LysPhe 405 410 415 Leu Arg Ile Val Val Trp Phe Val Ser Leu Leu Ala Leu LeuGly Asn 420 425 430 Val Phe Val Leu Leu Ile Leu Leu Thr Ser His Tyr LysLeu Asn Val 435 440 445 Pro Arg Phe Leu Met Cys Asn Leu Ala Phe Ala AspPhe Cys Met Gly 450 455 460 Met Tyr Leu Leu Leu Ile Ala Ser Val Asp LeuTyr Thr His Ser Glu 465 470 475 480 Tyr Tyr Asn His 
Ala Ile Asp Trp GlnThr Gly Pro Gly Cys Asn Thr 485 490 495 Ala Gly Phe Phe Thr Val Phe AlaSer Glu Leu Ser Val Tyr Thr Leu 500 505 510 Thr Val Ile Thr Leu Glu ArgTrp Tyr Ala Ile Thr Phe Ala Met Arg 515 520 525 Leu Asp Arg Lys Ile ArgLeu Arg His Ala Cys Ala Ile Met Val Gly 530 535 540 Gly Trp Val Cys CysPhe Leu Leu Ala Leu Leu Pro Leu Val Gly Ile 545 550 555 560 Ser Ser TyrAla Lys Val Ser Ile Cys Leu Pro Met Asp Thr Glu Thr 565 570 575 Pro LeuAla Leu Ala Tyr Ile Val Phe Val Leu Thr Leu Asn Ile Val 580 585 590 AlaPhe Val Ile Val Cys Cys Cys Tyr Val Lys Ile Tyr Ile Thr Val 595 600 605Arg Asn Pro Gln Tyr Asn Pro Gly Asp Lys Asp Thr Lys Ile Ala Lys 610 615620 Arg Met Ala Val Leu Ile Phe Thr Asp Phe Ile Cys Met Ala Pro Ile 625630 635 640 Ser Phe Tyr Ala Leu Ser Ala Ile Leu Asn Lys Pro Leu Ile ThrVal 645 650 655 Ser Asn Ser Lys Ile Leu Leu Val Leu Phe Tyr Pro Leu AsnSer Cys 660 665 670 Ala Asn Pro Phe Leu Tyr Ala Ile Phe Thr Lys Ala PheGln Arg Asp 675 680 685 Val Phe Ile Leu Leu Ser Lys Phe Gly Ile Cys LysArg Gln Ala Gln 690 695 700 Ala Tyr Arg Gly Gln Arg Val Pro Pro Lys AsnSer Thr Asp Ile Gln 705 710 715 720 Val Gln Lys Val Thr His Asp Met ArgGln Gly Leu His Asn Met Glu 725 730 735 Asp Val Tyr Glu Leu Ile Glu AsnSer His Leu Thr Pro Lys Lys Gln 740 745 750 Gly Gln Ile Ser Glu Glu TyrMet Gln Thr Val Leu 755 760 764 amino acids amino acid single linearprotein unknown 60 Met Arg Pro Pro Pro Leu Leu His Leu Ala Leu Leu LeuAla Leu Pro 1 5 10 15 Arg Ser Leu Gly Gly Lys Gly Cys Pro Ser Pro ProCys Glu Cys His 20 25 30 Gln Glu Asp Glu Phe Arg Val Thr Cys Lys Asp IleHis Arg Ile Pro 35 40 45 Thr Leu Pro Pro Ser Thr Gln Thr Leu Lys Phe IleGlu Thr Gln Leu 50 55 60 Lys Thr Ile Pro Ser Arg Ala Phe Ser Asn Leu ProAsn Ile Ser Arg 65 70 75 80 Ile Tyr Leu Ser Ile Asp Ala Thr Leu Gln ArgLeu Glu Ser His Ser 85 90 95 Phe Tyr Asn Leu Ser Lys Met Thr His Ile GluIle Arg Asn Thr Arg 100 105 110 Ser Leu Thr Ser Ile Asp Pro Asp Ala LeuLys Glu Leu Pro Leu Leu 115 120 125 Lys Phe Leu Gly Ile Phe Asn 
Thr GlyLeu Gly Val Phe Pro Asp Val 130 135 140 Thr Lys Val Tyr Ser Thr Asp ValPhe Phe Ile Leu Glu Ile Thr Asp 145 150 155 160 Asn Pro Tyr Met Ala SerIle Pro Ala Asn Ala Phe Gln Gly Leu Cys 165 170 175 Asn Glu Thr Leu ThrLeu Lys Leu Tyr Asn Asn Gly Phe Thr Ser Ile 180 185 190 Gln Gly His AlaPhe Asn Gly Thr Lys Leu Asp Ala Val Tyr Leu Asn 195 200 205 Lys Asn LysTyr Leu Ser Ala Ile Asp Lys Asp Ala Phe Gly Gly Val 210 215 220 Tyr SerGly Pro Thr Leu Leu Asp Val Ser Tyr Thr Ser Val Thr Ala 225 230 235 240Leu Pro Ser Lys Gly Leu Glu His Leu Lys Glu Leu Ile Ala Arg Asn 245 250255 Thr Trp Thr Leu Lys Lys Leu Pro Leu Ser Leu Ser Phe Leu His Leu 260265 270 Thr Arg Ala Asp Leu Ser Tyr Pro Ser His Cys Cys Ala Phe Lys Asn275 280 285 Gln Lys Lys Ile Arg Gly Ile Leu Glu Ser Leu Met Cys Asn GluSer 290 295 300 Ser Ile Arg Ser Leu Arg Gln Arg Lys Ser Val Asn Thr LeuAsn Gly 305 310 315 320 Pro Phe Asp Gln Glu Tyr Glu Glu Tyr Leu Gly AspSer His Ala Gly 325 330 335 Tyr Lys Asp Asn Ser Gln Phe Gln Asp Thr AspSer Asn Ser His Tyr 340 345 350 Tyr Val Phe Phe Glu Glu Gln Glu Asp GluIle Leu Gly Phe Gly Gln 355 360 365 Glu Leu Lys Asn Pro Gln Glu Glu ThrLeu Gln Ala Phe Asp Ser His 370 375 380 Tyr Asp Tyr Thr Val Cys Gly GlyAsn Glu Asp Met Val Cys Thr Pro 385 390 395 400 Lys Ser Asp Glu Phe AsnPro Cys Glu Asp Ile Met Gly Tyr Lys Phe 405 410 415 Leu Arg Ile Val ValTrp Phe Val Ser Leu Leu Ala Leu Leu Gly Asn 420 425 430 Val Phe Val LeuIle Val Leu Leu Thr Ser His Tyr Lys Leu Thr Val 435 440 445 Pro Arg PheLeu Met Cys Asn Leu Ala Phe Ala Asp Phe Cys Ile Gly 450 455 460 Ile TyrLeu Leu Leu Ile Ala Ser Val Asp Ile His Thr Lys Ser Gln 465 470 475 480Tyr His Asn Tyr Ala Ile Asp Trp Gln Thr Gly Ala Gly Cys Asp Ala 485 490495 Ala Gly Phe Phe Thr Val Phe Ala Ser Glu Leu Ser Val Tyr Thr Leu 500505 510 Thr Val Ile Thr Leu Glu Arg Trp His Thr Ile Thr His Ala Met Gln515 520 525 Leu Asp Cys Lys Val Gln Leu Arg His Ala Tyr Ser Ala Met ValGly 530 535 540 Met Trp Ile Phe Ala Phe Ala Ala Ala Leu Phe Pro Ile PheGly 
Ile 545 550 555 560 Ser Ser Tyr Met Lys Val Ser Ile Cys Leu Pro MetAsp Ile Asp Ser 565 570 575 Pro Leu Ser Leu Gln Tyr Val Ile Leu Leu LeuLeu Leu Asn Val Leu 580 585 590 Ala Phe Ile Ile Val Cys Ser Cys Tyr ValLys Ile Tyr Ile Thr Val 595 600 605 Arg Asn Pro Gln Tyr Asn Pro Gly AspLys Asp Thr Lys Ile Ala Lys 610 615 620 Arg Met Ala Val Leu Ile Phe ThrAsp Phe Met Cys Met Ala Pro Ile 625 630 635 640 Ser Phe Tyr Ala Leu SerAla Leu Met Asn Lys Pro Leu Ile Thr Val 645 650 655 Thr Asn Ser Lys IleLeu Leu Val Leu Phe Tyr Pro Leu Asn Ser Cys 660 665 670 Ala Asn Pro PheLeu Tyr Ala Ile Phe Thr Lys Ala Phe Gln Arg Asp 675 680 685 Val Phe IleLeu Leu Ser Lys Phe Gly Ile Cys Lys Arg Gln Ala Gln 690 695 700 Ala TyrArg Gly Gln Arg Val Ser Pro Lys Asn Ser Ala Gly Ile Gln 705 710 715 720Ile Gln Lys Val Thr Arg Asp Met Arg Gln Ser Leu Pro Asn Met Gln 725 730735 Asp Glu Tyr Glu Leu Leu Glu Asn Ser His Leu Thr Pro Asn Lys Gln 740745 750 Gly Gln Ile Ser Lys Glu Tyr Asn Gln Thr Val Leu 755 760 764amino acids amino acid single linear protein unknown 61 Met Arg Pro AlaAsp Leu Leu Gln Leu Val Leu Leu Leu Asp Leu Pro 1 5 10 15 Arg Asp LeuGly Gly Met Gly Cys Ser Ser Pro Pro Cys Glu Cys His 20 25 30 Gln Glu GluAsp Phe Arg Val Thr Cys Lys Asp Ile Gln Arg Ile Pro 35 40 45 Ser Leu ProPro Ser Thr Gln Thr Leu Lys Leu Ile Glu Thr His Leu 50 55 60 Arg Thr IlePro Ser His Ala Phe Ser Asn Leu Pro Asn Ile Ser Arg 65 70 75 80 Ile TyrVal Ser Ile Asp Leu Thr Leu Gln Gln Leu Glu Ser His Ser 85 90 95 Phe TyrAsn Leu Ser Lys Val Thr His Ile Glu Ile Arg Asn Thr Arg 100 105 110 AsnLeu Thr Tyr Ile Asp Pro Asp Ala Leu Lys Glu Leu Pro Leu Leu 115 120 125Lys Phe Leu Gly Ile Phe Asn Thr Gly Leu Lys Met Phe Pro Asp Leu 130 135140 Thr Lys Val Tyr Ser Thr Asp Ile Phe Phe Ile Leu Glu Ile Thr Asp 145150 155 160 Asn Pro Tyr Met Thr Ser Ile Pro Val Asn Ala Phe Gln Gly LeuCys 165 170 175 Asn Glu Thr Leu Thr Leu Lys Leu Tyr Asn Asn Gly Phe ThrSer Val 180 185 190 Gln Gly Tyr Ala Phe Asn Gly Thr Lys Leu Asp Ala ValTyr Leu Asn 195 
200 205 Lys Asn Lys Tyr Leu Thr Val Ile Asp Lys Asp AlaPhe Gly Gly Val 210 215 220 Tyr Ser Gly Pro Ser Leu Leu Asp Val Ser GlnThr Ser Val Thr Ala 225 230 235 240 Leu Pro Ser Lys Gly Leu Glu His LeuLys Glu Leu Ile Ala Arg Asn 245 250 255 Thr Trp Thr Leu Lys Lys Leu ProLeu Ser Leu Ser Phe Leu His Leu 260 265 270 Thr Arg Ala Asp Leu Ser TyrPro Ser His Cys Cys Ala Phe Lys Asn 275 280 285 Gln Lys Lys Ile Arg GlyIle Leu Glu Ser Leu Met Cys Asn Glu Ser 290 295 300 Ser Met Gln Ser LeuArg Gln Arg Lys Ser Val Asn Ala Leu Asn Ser 305 310 315 320 Pro Leu HisGln Glu Tyr Glu Glu Asn Leu Gly Asp Ser Ile Val Gly 325 330 335 Tyr LysGlu Lys Ser Lys Phe Gln Asp Thr His Asn Asn Ala His Tyr 340 345 350 TyrVal Phe Phe Glu Glu Gln Glu Asp Glu Ile Ile Gly Phe Gly Gln 355 360 365Glu Leu Lys Asn Pro Gln Glu Glu Thr Leu Gln Ala Phe Asp Ser His 370 375380 Tyr Asp Tyr Thr Ile Cys Gly Asp Ser Glu Asp Met Val Cys Thr Pro 385390 395 400 Lys Ser Asp Glu Phe Asn Pro Cys Glu Asp Ile Met Gly Tyr LysPhe 405 410 415 Leu Arg Ile Val Val Trp Phe Val Ser Leu Leu Ala Leu LeuGly Asn 420 425 430 Val Phe Val Leu Leu Ile Leu Leu Thr Ser His Tyr LysLeu Asn Val 435 440 445 Pro Arg Phe Leu Met Cys Asn Leu Ala Phe Ala AspPhe Cys Met Gly 450 455 460 Met Tyr Leu Leu Leu Ile Ala Ser Val Asp LeuTyr Thr His Ser Glu 465 470 475 480 Tyr Tyr Asn His Ala Ile Asp Trp GlnThr Gly Pro Gly Cys Asn Thr 485 490 495 Ala Gly Phe Phe Thr Val Phe AlaSer Glu Leu Ser Val Tyr Thr Leu 500 505 510 Thr Val Ile Thr Leu Glu ArgTrp Tyr Ala Ile Thr Phe Ala Met Arg 515 520 525 Leu Asp Arg Lys Ile ArgLeu Arg His Ala Ala Ala Ile Met Val Gly 530 535 540 Gly Trp Val Cys CysPhe Leu Leu Ala Leu Leu Pro Leu Val Gly Ile 545 550 555 560 Ser Ser TyrAla Lys Val Ser Ile Cys Leu Pro Met Asp Thr Glu Thr 565 570 575 Pro LeuAla Leu Ala Tyr Ile Met Ser Val Leu Val Leu Asn Ile Val 580 585 590 AlaPhe Val Ile Val Cys Cys Cys Tyr Val Lys Ile Tyr Ile Thr Val 595 600 605Arg Asn Pro Gln Tyr Asn Pro Gly Asp Lys Asp Thr Lys Ile Ala Lys 610 615620 Arg Met Ala Val Leu Ile 
Phe Thr Asp Phe Ile Cys Met Ala Pro Ile 625630 635 640 Ser Phe Tyr Ala Leu Ser Ala Ile Leu Asn Lys Pro Leu Ile ThrVal 645 650 655 Ser Asn Ser Lys Ile Leu Leu Val Leu Phe Tyr Pro Leu AsnSer Cys 660 665 670 Ala Asn Pro Phe Leu Tyr Ala Ile Phe Thr Lys Ala PheGln Arg Asp 675 680 685 Val Phe Ile Leu Leu Ser Lys Phe Gly Ile Cys LysArg Gln Ala Gln 690 695 700 Ala Tyr Arg Gly Gln Arg Val Pro Pro Lys AsnSer Thr Asp Ile Gln 705 710 715 720 Val Gln Lys Val Thr His Asp Met ArgGln Gly Leu His Asn Met Glu 725 730 735 Asp Val Tyr Glu Leu Ile Glu AsnSer His Leu Thr Pro Lys Lys Gln 740 745 750 Gly Gln Ile Ser Glu Glu TyrMet Gln Thr Val Leu 755 760 3710 base pairs nucleic acid single linearcDNA unknown 62 AGGCAGCAGT TTCCTCCTGG GACCTGATGG CTCCCAGATC ACTATCTTGGGCCCAGACTT 60 TCTGGAGCTG AATCTCCAGT TGCCTCGGAG CCTCCTCAGA CTCAGTGTGGCCAGAATGGT 120 GGTCCTGGCT TCCCCTCGGG CCTGCCCTTC TGCCTCCTTC TGCACCCTGAGATGGTCATC 180 AGCTTTTCTC CCACTGCTGC CCTGTATGCA GGGAAGGCCT GCCTGTGGCTGTATCTGTAG 240 TACTTCTTGA ATGTGTTTCC TTCTCCCCCA GGCCAGAGCT GAGAATGAGGCGATTTCGGA 300 GGATGGAGAA ATAGCCCCGA GTCCCGTGGA AAATGAGGCC GGCGGACTTGCTGCAGCTGG 360 TGCTGCTGCT CGACCTGCCC AGGGACCTGG GCGGAATGGG GTGTTCGTCTCCACCCTGCG 420 AGTGCCATCA GGAGGAGGAC TTCAGAGTCA CCTGCAAGGA TATTCAACGCATCCCCAGCT 480 TACCGCCCAG TACGCAGACT CTGAAGCTTA TTGAGACTCA CCTGAGAACTATTCCAAGTC 540 ATGCATTTTC TAATCTGCCC AATATTTCCA GAATCTACGT ATCTATAGATCTGACTCTGC 600 AGCAGCTGGA ATCACACTCC TTCTACAATT TGAGTAAAGT GACTCACATAGAAATTCGGA 660 ATACCAGGAA CTTAACTTAC ATAGACCCTG ATGCCCTCAA AGAGCTCCCCCTCCTAAAGT 720 TCCTTGGCAT TTTCAACACT GGACTTAAAA TGTTCCCTGA CCTGACCAAAGTTTATTCCA 780 CTGATATATT CTTTATACTT GAAATTACAG ACAACCCTTA CATGACGTCAATCCCTGTGA 840 ATGCTTTTCA GGGACTATGC AATGAAACCT TGACACTGAA GCTGTACAACAATGGCTTTA 900 CTTCAGTCCA AGGATATGCT TTCAATGGGA CAAAGCTGGA TGCTGTTTACCTAAACAAGA 960 ATAAATACCT GACAGTTATT GACAAAGATG CATTTGGAGG AGTATACAGTGGACCAAGCT 1020 TGCTGGACGT GTCTCAAACC AGTGTCACTG CCCTTCCATC CAAAGGCCTGGAGCACCTGA 1080 AGGAACTGAT AGCAAGAAAC ACCTGGACTC TTAAGAAACT 
TCCACTTTCCTTGAGTTTCC 1140 TTCACCTCAC ACGGGCTGAC CTTTCTTACC CAAGCCACTG CTGTGCTTTTAAGAATCAGA 1200 AGAAAATCAG AGGAATCCTT GAGTCCTTGA TGTGTAATGA GAGCAGTATGCAGAGCTTGC 1260 GCCAGAGAAA ATCTGTGAAT GCCTTGAATA GCCCCCTCCA CCAGGAATATGAAGAGAATC 1320 TGGGTGACAG CATTGTTGGG TACAAGGAAA AGTCCAAGTT CCAGGATACTCATAACAACG 1380 CTCATTATTA CGTCTTCTTT GAAGAACAAG AGGATGAGAT CATTGGTTTTGGCCAGGAGC 1440 TCAAAAACCC CCAGGAAGAG ACTCTACAAG CTTTTGACAG CCATTATGACTACACCATAT 1500 GTGGGGACAG TGAAGACATG GTGTGTACCC CCAAGTCCGA TGAGTTCAACCCGTGTGAAG 1560 ACATAATGGG CTACAAGTTC CTGAGAATTG TGGTGTGGTT CGTTAGTCTGCTGGCTCTCC 1620 TGGGCAATGT CTTTGTCCTG CTTATTCTCC TCACCAGCCA CTACAAACTGAACGTCCCCC 1680 GCTTTCTCAT GTGCAACCTG GCCTTTGCGG ATTTCTGCAT GGGGATGTACCTGCTCCTCA 1740 TCGCCTCTGT AGACCTCTAC ACTCACTCTG AGTACTACAA CCATGCCATCGACTGGCAGA 1800 CAGGCCCTGG GTGCAACACG GCTGGTTTCT TCACTGTCTT TGCAAGCGAGTTATCGGTGT 1860 ATACGCTGAC GGTCATCACC CTGGAGCGCT GGTATGCCAT CACCTTCGCCATGCGCCTGG 1920 ACCGGAAGAT CCGCCTCAGG CACGCATGTG CCATCATGGT TGGGGGCTGGGTTTGCTGCT 1980 TCCTCCTCGC CCTGCTTCCT TTGGTGGGAA TAAGTAGCTA TGCCAAAGTCAGTATCTGCC 2040 TGCCCATGGA CACCGAGACC CCTCTTGCTC TGGCATATAT TGTTTTTGTTCTGACGCTCA 2100 ACATAGTTGC CTTCGTCATC GTCTGCTGCT GTTATGTGAA GATCTACATCACAGTCCGAA 2160 ATCCGCAGTA CAACCCAGGG GACAAAGATA CCAAAATTGC CAAGAGGATGGCTGTGTTGA 2220 TCTTCACCGA CTTCATATGC ATGGCCCCAA TCTCATTCTA TGCTCTGTCAGCAATTCTGA 2280 ACAAGCCTCT CATCACTGTT AGCAACTCCA AAATCTTGCT GGTACTCTTCTATCCACTTA 2340 ACTCCTGTGC CAATCCATTC CTCTATGCTA TTTTCACCAA GGCCTTCCAGAGGGATGTGT 2400 TCATCCTACT CAGCAAGTTT GGCATCTGTA AACGCCAGGC TCAGGCATACCGGGGGCAGA 2460 GGGTTCCTCC AAAGAACAGC ACTGATATTC AGGTTCAAAA GGTTACCCACGACATGAGGC 2520 AGGGTCTCCA CAACATGGAA GATGTCTATG AACTGATTGA AAACTCCCATCTAACCCCAA 2580 AGAAGCAAGG CCAAATCTCA GAAGAGTATA TGCAAACGGT TTTGTAAGTTAACACTACAC 2640 TACTCACAAT GCTAGGGGAA CTTACAAAAT AATAGTTTCT TGAATATGCATTCCAATCCC 2700 ATGACACCCC CAACACATAG CTGCCCTCAC TCTTGTGCAG GCGATGTTTCAATGTTTCAT 2760 GGGGCAAGAG TTTATCTCTG GAGAGTGATT AGTATTAACC TAATCATTGCCCCCAAGAAG 2820 GAAGTTAGGC 
TACCAGCATA TTTGAATGCC AGGTGAAATC AAAATAATCTACACTATCTA 2880 GAAGACTTTC TTGATGCCAA GTCCAGAGAT GTCATTGTGT AGGATGTTCAGTAAATATTA 2940 ACTGAGCTAT GTCAATATAG AGCTTCTCAG TTTTGTATAA CATTTCATACTAAAGATTCA 3000 GCAAATGGAA AATGCTATTA ATTTGGTTGG TGACCACAAG ATAAAATCAGTCCCACGTTG 3060 GCTCAGTTCA ACTAGATGTT CCCTGATACA AAGAGAACTT GATTTCCTTAAAACTGAAAA 3120 GCCAAACACA GCTAGCTGTC ATACAAGAAA CAGCTATTAT GAGACATGAAGGAGGGTAAG 3180 AATTAGCTTT AAGTTTTGTT TTGCTTTGTT TTGTTTTTTA ACTCAACCTATTAATCATCT 3240 CTTCACAAGA ATCCACCTGA TGTGACCAAG CTATTATGTG TTGCCTGGAAAAACTGGCAA 3300 GATTTCAGCT TATGTGGCCT AGCAAACTAA GAATTGCTCT TCTTGGCCAGCCTCATAGCA 3360 TAAAAGATGT GAACTCTAGG AAGTCTTTCT CAGTAGCAAT AAGTGGGAATTATGGGCAGA 3420 GCACACTCAA TCCCCTGTTG ATTAATAAAA CAGGCTGGAC ACTAATTAACTATGGGACTT 3480 AAATCTGTAG AAATGAAGGA GTCCAATAGC TTCTTCCAAT TTTAAAACTCTAGTACATCC 3540 CTTTCCCTCA AATATATATT TCTAAGATAA AGAGAAAGAA GAGCACTAAGTAAGTAGAAT 3600 CTGTTTTTCC TATTTTGTAG GGCTGCTGAC TCCTAGTCCT TGAAGCTTAGACACATGACC 3660 CAGGAAATTT TCCTTTGTTT CACTTTTGAT TATGATGTCT GAGCCAAAAA3710

What is claimed is:
 1. Process for the quantitative detection of TSH or of anti-TSHr antibodies comprising the steps of: contacting intact cells operationally transformed by a vector comprising a cDNA sequence encoding the amino acid sequence represented in SEQ ID NO:50 or membrane preparations of such cells with a biological sample suspected of containing TSH or anti-TSHr antibodies; measuring in the intact cells or membranes the change in adenylyl cyclase activity; and correlating results from the measuring step to the presence of TSH or anti-TSHr antibodies.
 2. The process according to claim 1 wherein the cDNA sequence is represented in SEQ ID NO:62.
 3. Process for the quantitative detection of TSH or of anti-TSHr antibodies comprising the steps of: contacting intact cells operationally transformed by a vector comprising a cDNA sequence encoding the amino acid sequence represented in SEQ ID NO:59, or membrane preparations of such cells with a biological sample suspected of containing TSH or anti-TSHr antibodies; measuring adenylyl cyclase as an indicator of the activating effect of TSH or by “blocking” anti-TSHr antibodies present in the biological sample; and correlating results from the measuring step to the presence of TSH or anti-TSHr antibodies.
 4. The process according to claim 3 wherein the cDNA sequence is represented in SEQ ID NO:62.
 5. A biologically active preparation of human TSH receptor in the form of an isolated recombinant polypeptide expressed by a transformed host cell, said polypeptide comprising the amino acid sequence set forth in SEQ ID NO:59, and being free of impurities associated with detergent-solubilized thyroid membrane preparations.
 6. An isolated cDNA encoding the polypeptide according to claim 5.
 7. The isolated nucleotide sequence according to claim 6 characterised in that the sequence is a DNA sequence having the sequence listed in SEQ ID NO:62 (shown in FIG. 12).
 8. Process for the preparation of a polypeptide according to claim 5, comprising the steps of: inserting a vector, which operationally contains a cDNA sequence encoding the polypeptide, into a host cell such that the cell is transformed; and expressing said nucleic acid to obtain said polypeptide.
 9. Process for the quantitative detection of anti-thyrotropin receptor antibodies (anti-TSHr) in a biological sample comprising the steps of: contacting a polypeptide according to claim 5 with the biological sample suspected of containing anti-TSHr antibodies, incubating with labelled TSH, or with labelled anti-TSHr antibodies; measuring the remaining, bound labelled TSH or bound labelled anti-TSHr antibodies, after competition between the labelled and unlabelled species; and correlating results from the measuring step to the presence of anti-TSHr antibodies.
 10. Kit for the detection of anti-TSHr antibodies characterized in that it contains: a) a polypeptide according to claim 5, having thyrotropin receptor activity and being either in an immobilised or detergent-solubilised form; b) at least one of the following reagents: i) labelled TSH; ii) labelled anti-TSHr antibodies.
 11. Kit according to claim 10, wherein the polypeptide is present in the form of intact cells previously operationally transformed by a vector comprising a cDNA sequence encoding said polypeptide and consequently bearing said polypeptide in their membranes, or in the form of detergent-solubilized membranes of such cells.